Stephen Kiers - Staff+ Software Engineer

Series: AI & Expertise (3 of 3)

I've spent the last two posts explaining why expertise matters when working with AI tools. First, a backyard engineering disaster where I couldn't catch ChatGPT's correct answers because I didn't have the expertise to recognize which ones mattered. Then, why LLMs need supervision, not worship, and why the productivity gap comes from code review skills, not prompting skills.

This post is the practical payoff. OK, so expertise matters. ChatGPT is stateless. Now what? How do you actually work with this thing effectively?

The answer is simpler than you'd expect.

The Cold Start Problem

Every ChatGPT session starts cold.

The model doesn't remember your last conversation. It doesn't know your codebase. It doesn't know why you chose PostgreSQL over DynamoDB, why that service is structured the way it is, or what broke the last time someone touched the billing module.

Most engineers fight this. They write longer prompts. They paste in more context. They re-explain the same architecture, the same constraints, the same decisions—every single session. And then the context window fills up, the model starts losing the thread, and the output quality drops off a cliff.

Future Steve here (2025):

This turned out to be measurable. Chroma's research (Hong et al., 2025) tested 18 LLMs and found that models don't use context uniformly—performance gets increasingly unreliable as input length grows. Bigger context windows don't solve this. A million tokens doesn't help if the model can't attend to the right ones. More on this in the follow-up.

The instinct to fight statelessness is wrong. You don't extend context. You externalize memory.

Documentation as Cognitive RAM

Here's the reframe: your project documentation isn't just for humans anymore. It's cognitive RAM for a stateless system.

Documentation written for humans gets read occasionally. Cognitive RAM gets uploaded to ChatGPT at the start of every session. Different purpose, different design.

When I start a ChatGPT session for a project, I upload a file that tells it how to navigate the codebase—what the architecture is, where things live, what patterns to follow, what to avoid. I upload another file that tracks what I've been working on and what's next. If there's relevant domain context, I upload that too.

I wrote these files once. ChatGPT reads them at the start of every session. Write-once, read-many. That's the primitive.
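If you'd rather script this step than gather files by hand, a minimal sketch could look like the following. The function name and the file names in the usage note (ARCHITECTURE.md, STATE.md) are hypothetical placeholders, not part of any tool. The output is plain text to paste or attach at the start of a session.

```python
from pathlib import Path


def build_session_preamble(paths):
    """Concatenate context files into one block of text to paste or upload."""
    sections = []
    for path in paths:
        path = Path(path)
        body = path.read_text(encoding="utf-8").strip()
        # Label each file so the model can tell where one document ends
        # and the next begins.
        sections.append(f"## {path.name}\n\n{body}")
    return "\n\n".join(sections)
```

Calling build_session_preamble(["ARCHITECTURE.md", "STATE.md"]) returns both files under their own headings, in the order given. The files stay the source of truth; the preamble is rebuilt each session.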

The Manual Workflow

I'm not going to pretend this is elegant. Here's what it actually looks like right now.

Every time I start a new ChatGPT conversation about a project, I upload files. A markdown file with the architecture overview. A file with current state—what's done, what's in progress, what's blocked. Sometimes a file with domain context that would take twenty messages to re-explain.

Inside these files, I include instructions for ChatGPT itself: "If you need more context about the database schema, ask me." "If you're unsure about a convention, ask before generating code." And there is a LOT of "if you need more context, ask." Because the alternative is ChatGPT confidently guessing—and guessing wrong.

For bigger documents—API specs, long decision histories—I ask ChatGPT to parse and summarize them into "hint" files. Compressed versions that capture the essential context in fewer tokens. Then I upload the hint file in future sessions instead of the full document. Manual knowledge distillation.
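Here is a rough sketch of the hint-file step, under stated assumptions. The four-characters-per-token estimate is only a common rule of thumb for English text, not a real tokenizer. The thresholds and the prompt wording are illustrative choices, not something I've measured.

```python
def rough_token_count(text):
    """Very rough estimate: about 4 characters per token (assumed heuristic)."""
    return len(text) // 4


def needs_hint_file(text, budget_tokens=2000):
    """True when a document is big enough that a compressed hint is worth making."""
    return rough_token_count(text) > budget_tokens


def distillation_prompt(doc_name, text, target_tokens=500):
    """Build the request asking the model to compress a document into a hint file."""
    return (
        f"Summarize {doc_name} into a hint file of roughly {target_tokens} tokens.\n"
        "Keep decisions, constraints, and names of things. Drop narrative and history.\n"
        "If anything is ambiguous, list it as an open question instead of guessing.\n\n"
        f"--- {doc_name} ---\n{text}"
    )
```

The summary that comes back still needs a human read before it becomes a hint file. A compressed document that quietly dropped a constraint is worse than no hint at all.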

I copy-paste context into every conversation. I branch conversations when I need to add more context iteratively—start a thread, realize it needs more background, fork a new conversation with the missing pieces included upfront.

It's overhead. Real overhead. Uploading files, maintaining files, remembering which files to upload for which project, keeping the hint files in sync with reality. Some days it feels like I'm spending as much time managing context as I am actually working.
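One piece of that overhead can be partly automated: spotting hint files that have fallen behind their source documents. This sketch assumes you keep a mapping from each source document to its hint file (a hypothetical structure) and uses file modification times as a crude staleness signal.

```python
from pathlib import Path


def stale_hints(pairs):
    """Return hint files whose source document changed after the hint was written.

    `pairs` maps a source document path to the hint file distilled from it.
    """
    stale = []
    for source, hint in pairs.items():
        source, hint = Path(source), Path(hint)
        # A missing hint, or a source modified more recently than its hint,
        # means the hint should be regenerated.
        if not hint.exists() or source.stat().st_mtime > hint.stat().st_mtime:
            stale.append(hint)
    return stale
```

This only catches the case where the source file itself was edited. It can't tell you when the project has drifted away from both files, which is still a judgment call.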

But here's the thing: even with all that friction, the sessions where I do this are dramatically better than the sessions where I don't. The difference between a ChatGPT that knows your project and one that's starting from zero is night and day.

I'm convinced this approach matters. The tooling is painful right now—I expect it will improve. But the core idea of giving the model structured context, of treating documentation as something the AI reads every session rather than something humans read occasionally? That's the insight. The manual workflow is just the current cost of entry.

The Expertise Connection

This is where the series comes together.

In the backyard disaster, I couldn't build the right mental model because I lacked structural engineering expertise. I asked reasonable questions—I just couldn't recognize which answers were relevant.

In the supervision post, I argued that seniors get leverage from AI because they can review output—they have the scar tissue to know what good looks like.

Externalized memory is the practical expression of that expertise. When a senior engineer writes a context document for ChatGPT, they know what to include because they've onboarded people before. They've seen what happens when a new engineer doesn't understand the architecture. They've watched junior developers make mistakes that a two-paragraph context document would have prevented.

Juniors may not be able to build this yet. Not because they're not smart, but because they're still accumulating the judgment about what's worth persisting. You can't externalize knowledge you haven't built. That's not a criticism. It's just how expertise works.

This is the real AI skill gap. It's not prompting. It's knowing what to write down.

The Maintenance Question

Let's be honest: maintaining these files is overhead.

Context documents go stale. Project state drifts from reality. If you've ever worked at a company with a wiki, you know the failure mode—six months in, half the pages are wrong and nobody trusts any of them.

So why does this work?

The feedback loop is immediate. Wiki pages are written for humans who might read them someday. These files are uploaded to ChatGPT every session. If your context file is wrong, you get bad output today, not in six months. You feel the pain immediately and fix it.

The scope is smaller. You're not documenting everything. You're documenting what ChatGPT needs to be useful in this specific project. That's a much smaller surface area than "all institutional knowledge."

You're both author and consumer. You write the context file, you experience ChatGPT's output, and you update the file. The incentive to maintain it is direct and personal.

Is it zero overhead? No. But the alternative—re-explaining the same context every session, watching ChatGPT make the same mistakes because it doesn't know what you know—costs more.

Before and After

Without externalized memory:

Every session starts with "here's my project structure..."
You paste the same context into every conversation
ChatGPT makes mistakes you've corrected before
Context windows fill up with re-explanations, leaving less room for actual work
You feel like you're training a new junior every day

With externalized memory:

Sessions start with ChatGPT already oriented
Context is written once and uploaded, not re-explained
Past decisions and reasoning persist across sessions
You work on problems, not on re-establishing context
The model produces better outputs because it has better inputs

The tool didn't change. The context did.

Getting Started

If you take one thing from this post: start with a context file.

Create a markdown file for your project. Write down:

What this project is
How it's structured
What conventions to follow
What to avoid

That's it. One file. Upload it at the start of every ChatGPT session about this project. Update it when you notice ChatGPT doing something dumb that a sentence of context would have prevented.
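If a blank page is the obstacle, the four headings above can be stamped into a starter file. This is a small sketch, not a prescribed format. The function name, the TODO placeholders, and the closing "ask before generating code" line are illustrative assumptions, and CONTEXT.md in the usage note is just an example name.

```python
from pathlib import Path

SECTIONS = [
    "What this project is",
    "How it's structured",
    "What conventions to follow",
    "What to avoid",
]


def scaffold_context_file(path, project_name):
    """Write a starter context file with the four sections, unless one exists."""
    path = Path(path)
    if path.exists():
        return False  # never overwrite a file you have already been maintaining
    lines = [f"# {project_name} context", ""]
    for title in SECTIONS:
        lines += [f"## {title}", "", "TODO", ""]
    # Standing instruction so the model asks instead of guessing.
    lines += ["If you need more context than this file gives you, ask before generating code.", ""]
    path.write_text("\n".join(lines), encoding="utf-8")
    return True
```

Running scaffold_context_file("CONTEXT.md", "billing") once gives you the skeleton. Filling in the TODOs is the part that takes expertise.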

Everything else—state tracking files, hint files, domain context—comes later, when you feel the need. Start with the minimum. Add when it hurts.

The Takeaway

ChatGPT is stateless. That's not a limitation to fight—it's a design constraint to engineer around.

The engineers getting the most from AI aren't the ones with better prompts. They're the ones who've built persistent artifacts—context files, project state documents, domain summaries—that give every session a running start.

Documentation is cognitive RAM. Artifacts beat prompts. And the expertise to know what's worth writing down? That's the real skill gap.

The tool that makes AI useful isn't the model. It's the file you upload before you start talking.

What does your externalized memory system look like?

Update (2025): The tooling did improve. I wrote about what changed in Cognitive RAM, Two Years Later.