Stephen Kiers - Staff+ Software Engineer

The Gearbots Moment

I was at a Gearbots event with my son recently. If you're not familiar, Gearbots is a makerspace for kids in the Lower Mainland/Fraser Valley, BC, and honestly, one of the best things happening in our community. They're teaching kids to think systematically, debug when something doesn't work, and break big challenges into small ones. I can't recommend them enough.

Anyway. I overheard a group of adults and teenagers complaining about AI. "Hallucinations." "Can't trust it." "It just makes stuff up."

And I realized: this is the "knowing how to Google" problem all over again.

Twenty years ago, getting useful results from search engines was a skill. You learned to phrase queries, recognize spam, filter noise. People who couldn't do that thought search was broken. It wasn't—they just didn't understand the medium.

Now it's context windows and statelessness. Most people don't understand that LLMs start fresh every session. They don't realize that everything they say either helps focus the model or creates noise that drowns out the actual point. They fight the constraints instead of designing around them.

Meanwhile, my son was across the room learning to break problems down, think systematically, persist through frustration. I'm doing the same thing with AI context—designing around constraints rather than fighting them. The skill is the same. The medium changed.

That conversation reminded me of something I wrote two years ago.

The Original Insight

In late 2023, I wrote Cognitive RAM—a piece about treating documentation as memory for a stateless AI. The core argument was simple: ChatGPT starts every session from zero, so instead of fighting that, you build artifacts that give it a running start.

At the time, that meant manually uploading markdown files to ChatGPT before every conversation. Copy-pasting context. Building "hint" files. Branching conversations when I needed to add more background. It was tedious. It worked.

The primitive was right. The tooling caught up.

What Changed

In 2023, "externalized memory" meant uploading files to ChatGPT at the start of every session. Copying and pasting the same context paragraphs. Maintaining parallel hint files by hand. Remembering which files went with which project.

Now the landscape looks different:

Manual uploads → auto-read context files. Tools now read project-level files on startup—JSON, markdown, whatever format. No more remembering to upload. The context is just there.
Copy-paste rituals → persistent infrastructure. Context files live alongside the code. They're versioned. They evolve with the project.
Single conversations → coding assistants. What used to be "paste context into ChatGPT" is now handled by tools that operate inside your editor, reading your codebase and tracking state across sessions.
Experimentation → established patterns. Things I was inventing ad hoc in 2023 now have names and community practices around them.

Though "established" might be generous. Even in the last couple of months, I've changed how I structure these files. The approach is still evolving faster than I can write about it.

Context Rot Is Real (And Now Measurable)

One thing I intuited in 2023 now has research behind it. Chroma published a study (Hong et al., 2025) that tested 18 LLMs and found that models don't use context uniformly: performance becomes increasingly unreliable as input length grows.

This matters because the instinct when context isn't working is to add more of it. Bigger context window? Great, dump everything in. But the research shows that's exactly wrong. A million tokens doesn't help if the model can't attend to the right ones. Past a certain point, more context actively degrades output quality.

This validates the externalized memory approach from a direction I didn't expect. The point was never to cram as much information as possible into the model. The point was to give it the right information, structured so it could actually use it. Selective context. Indexed information. Lazy-loading what's relevant instead of dumping everything upfront.

The 2023 instinct—write-once, read-many, keep it focused—turns out to be the right architecture for how these models actually process information.
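To make "selective context" concrete, here is a minimal Python sketch of budgeting context instead of dumping everything in. Everything in it is an assumption for illustration: the `pick_within_budget` name, the relevance scores (which would have to come from search, an index, or human judgment), and word count as a crude stand-in for tokens. None of it comes from the Chroma study.

```python
def pick_within_budget(snippets: list[tuple[float, str]], budget_words: int) -> list[str]:
    """Greedily keep the highest-scored snippets whose total word count fits the budget.

    Each snippet is a (relevance_score, text) pair. Word count is only a rough
    proxy for tokens; a real tokenizer would give different numbers.
    """
    chosen: list[str] = []
    used = 0
    # Visit snippets from most to least relevant.
    for _score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = len(text.split())
        if used + cost <= budget_words:
            chosen.append(text)
            used += cost
    return chosen
```

The design choice is that relevance decides what gets in, and anything that doesn't fit is left out rather than squeezed in.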

The Patterns (Evolved)

The manual workflow from 2023 has matured into something more structured. Here's what I use now—described tool-agnostically, because the specific tools change faster than the patterns.

Context Files

Before: Copying and pasting the same 200 words about project architecture into every ChatGPT session.

Now: A file at the root of the project that your tools read automatically every session. Architecture overview, coding conventions, what NOT to do, how to navigate the repo. The format varies—markdown, text, JSON with specific fields. Whatever your tools expect.

This is still the most impactful pattern. One file that answers: "If a new engineer joined today and could only read one document before touching code, what would it say?" Except the new engineer joins every session.
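As a rough sketch of what "read automatically every session" amounts to, here is a small Python example. The file name `CONTEXT.md` and the helpers `load_context` and `build_prompt` are hypothetical; real tools each have their own file names and loading logic, so treat this as an illustration of the idea rather than how any particular tool works.

```python
from pathlib import Path

# Hypothetical file name; each real tool expects its own.
CONTEXT_FILE = "CONTEXT.md"


def load_context(project_root: str) -> str:
    """Return the text of the project context file, or "" if it does not exist."""
    path = Path(project_root) / CONTEXT_FILE
    return path.read_text(encoding="utf-8") if path.is_file() else ""


def build_prompt(project_root: str, user_request: str) -> str:
    """Prepend the project context to the request, since each session starts fresh."""
    context = load_context(project_root)
    if not context:
        return user_request
    return f"{context.strip()}\n\n---\n\n{user_request}"
```

Because the file lives in the repo, editing it once changes what every future session starts with.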

Plan Files

Before: Losing track of what I'd done and what was next between sessions. Starting over because neither I nor ChatGPT remembered where we left off.

Now: A state file that tracks current goals, what's done, what's blocked, what to work on next. Some tools update this automatically as they work. Others I maintain by hand. Either way, the state persists.

The key insight: AI tools themselves do this naturally when working on complex problems—they create and update state files as they go. It's context offloading. We should do the same thing explicitly.
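A plan file can be as simple as a small JSON document. The sketch below assumes a hypothetical layout with `goal`, `done`, `blocked`, and `next` fields, and hypothetical helper names; it is one possible shape, not a format any specific tool requires.

```python
import json
from pathlib import Path


def load_plan(path: str) -> dict:
    """Load the plan file, or return an empty plan if none exists yet."""
    p = Path(path)
    if p.is_file():
        return json.loads(p.read_text(encoding="utf-8"))
    return {"goal": "", "done": [], "blocked": [], "next": []}


def complete_task(plan: dict, task: str) -> dict:
    """Move a task out of 'next' (if present) and record it under 'done'."""
    if task in plan["next"]:
        plan["next"].remove(task)
    if task not in plan["done"]:
        plan["done"].append(task)
    return plan


def save_plan(path: str, plan: dict) -> None:
    """Write the plan back to disk so the next session can pick it up."""
    Path(path).write_text(json.dumps(plan, indent=2) + "\n", encoding="utf-8")
```

At the end of a session you would call `complete_task` for what got finished and `save_plan` to persist it; the next session starts by calling `load_plan`.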

Knowledge Bases

Before: Uploading a massive 40-page document and hoping ChatGPT would find the relevant paragraph.

Now: Folder structures with index files. The AI reads the index, identifies which sub-files are relevant, and loads only those. Lazy-loading, not everything at once.

This is the pattern that scales. It mirrors how human experts work—you don't reload everything you know about a system every time you sit down. You orient, then go deep where needed. Give the AI the same affordance.
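Here is a minimal sketch of the index-then-load idea in Python. It assumes a hypothetical `index.json` that maps each sub-file to a list of keywords, and it uses plain keyword matching as a stand-in for the judgment an AI tool would apply when reading the index. The function name `load_relevant` is also just illustrative.

```python
import json
from pathlib import Path


def load_relevant(kb_dir: str, question: str) -> dict[str, str]:
    """Read the index first, then load only the files whose keywords match."""
    kb = Path(kb_dir)
    # Hypothetical index format: {"billing.md": ["billing", "invoices"], ...}
    index = json.loads((kb / "index.json").read_text(encoding="utf-8"))
    # Normalize the question into lowercase words without trailing punctuation.
    words = {w.strip(".,?!").lower() for w in question.split()}
    selected = [
        name for name, keywords in index.items()
        if words & {k.lower() for k in keywords}
    ]
    # Only the selected files are ever read from disk.
    return {name: (kb / name).read_text(encoding="utf-8") for name in selected}
```

The index stays small enough to read every time, while the sub-files can grow without inflating what each session has to take in.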

Decision Logs

Before: Every new session asking "why did we choose PostgreSQL over DynamoDB?" and getting a generic answer instead of the actual reasoning.

Now: Files that capture reasoning, not just outcomes. "We chose PostgreSQL because X, Y, Z. We considered DynamoDB but rejected it because of A and B." The "why" persists so future sessions don't have to re-derive it.

This is the one most teams skip—and the one that matters most over time. Code tells you what was built. Decision logs tell you why.
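One way to keep decision records consistent is to give them a fixed shape. The Python sketch below is an assumption about what such a record could look like: the `Decision` fields and the `render` output format are hypothetical, and the reasons are the same placeholders used in the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    """One decision record: the outcome plus the reasoning behind it."""
    title: str
    chosen: str
    reasons: list[str]
    # Maps each rejected alternative to the reason it was rejected.
    rejected: dict[str, str] = field(default_factory=dict)


def render(decision: Decision) -> str:
    """Format a decision as plain text suitable for appending to a log file."""
    lines = [f"Decision: {decision.title}", f"Chosen: {decision.chosen}", "Why:"]
    lines += [f"  - {reason}" for reason in decision.reasons]
    if decision.rejected:
        lines.append("Rejected:")
        lines += [f"  - {alt}: {why}" for alt, why in decision.rejected.items()]
    return "\n".join(lines) + "\n"


record = Decision(
    title="Primary database",
    chosen="PostgreSQL",
    reasons=["X", "Y", "Z"],  # placeholders, not real reasons
    rejected={"DynamoDB": "A and B"},
)
print(render(record))
```

Making the rejected alternatives a required part of the shape is the point: it nudges whoever writes the record to capture the "why not" along with the "why".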

What to Externalize

Category	What to Write	Why It Matters	Example
Context	Architecture, conventions, navigation	AI orients itself every session	Markdown file at repo root with project overview
Plans	Current state, goals, blockers	Continuity across sessions	State file with task tracking and session notes
Decisions	Rationale, alternatives considered	Future sessions understand constraints	"Why we chose X over Y" in a decisions folder
Knowledge	Domain context, indexed by topic	Lazy-load relevant detail	Folder with index file mapping topics to sub-files
Patterns	What works, what doesn't, anti-patterns	Prevent repeated mistakes	Style guides, common pitfalls, "don't do X because Y"
Session Memory	Learnings from past interactions	Don't repeat the same corrections	Notes on mistakes the AI made and how to prevent them

The Expertise Connection

I wrote about this in the original article: the real AI skill gap isn't prompting, it's knowing what to write down. That's still true. But expertise now manifests differently.

In 2023, expertise meant knowing what context to upload. In 2025, it also means knowing how to structure that context—what goes in the root file vs. the knowledge base, what's worth indexing vs. what's noise, when to use a plan file vs. embedding state in the context file itself.

So the gap now has two parts: knowing what to write down, and knowing how to structure it so the model can actually use it.

Getting Started

If you're starting from zero, the path is the same one I described in 2023. It still works.

Start with one context file. Put it at the root of your project. Write down what the project is, how it's structured, which conventions to follow, and what to avoid. Use whatever format your tools expect.
Add a plan file when you lose continuity. When you start a session and can't remember where you left off, that's the signal. Write down current state and next steps.
Add decision logs when you re-explain. If you find yourself re-justifying the same architectural choice to the AI (or to teammates), write it down once.
Add a knowledge base when context gets too big. If your single context file is bloating, split it. Create an index. Let the AI load what's relevant.

Progressive sophistication. Add when it hurts, not before.

Closing

Two years ago, I was uploading markdown files to ChatGPT and branching conversations to manage context. It was clunky. It worked. The insight—that documentation is cognitive RAM for a stateless system—held up better than I expected.

The tools changed. The manual workflow became infrastructure. But the core loop is identical: figure out what the model needs to know, write it down, structure it so the model can use it, update it when it drifts.

The tool that makes AI useful isn't the model. It's the infrastructure sitting next to your code.

This is a follow-up to Cognitive RAM: How I Learned to Work with a Stateless AI (2023), part of the AI & Expertise series.