← all writing

I stopped asking my AI to check its work. I made checking unskippable.

Part 2 of the Building Leonard series. Part 1 was about why: my AI assistant lies to me with total confidence, and you can't prompt your way out of that. This one is about how Leonard catches it.

Last time I landed on the idea: don't trust the model's report of reality, check reality with something outside the model. Easy to say. The catch I ran into fast is that a check the model is allowed to skip is a check the model will skip. Not out of spite. It just doesn't think it needs to. It's always sure. If verifying is the polite optional step, the confident guesser sails right past it every time.

So the whole design comes down to one decision: make the checking unskippable. Leonard ended up with two halves, and only one of them is the part that matters.

#The window and the wall

The first half is a window. Leonard runs an MCP server, which is just a standard way to hand an AI a set of tools it can call. Mine are read-only lookups into the truth: is there really a function named NewLoginHandler? Where does it live? What did we already decide about how auth works in this project? The model can open that window any time and ask.

Useful, but on its own, useless. Because asking is optional, and we just covered what the model does with optional. The window helps on the good days. It does nothing on the day the model is confidently wrong, which is the only day I actually needed help.

The second half is the wall, and the wall is the whole point. It's built on hooks. A hook is just a little script Claude Code runs automatically at a fixed moment, whether the model wants it to or not. There's no asking. Before any edit touches a file, my pre-edit hook fires. The model doesn't get a vote.

#What the wall does, step by step

Here's the actual sequence when Claude goes to write a fix.

It composes an edit. Say the edit calls LoginHandler.ServeHTTP. Before that edit lands, the pre-edit hook grabs it, pulls the symbol names out of it (the functions and types it references), and checks each one against an index. The index is the boring, literal heart of the thing: a plain list of every function and type that actually exists in the project, parsed straight from the source code and kept in a small local database. Nothing about the model, nothing about what should exist. Just what does.

If LoginHandler.ServeHTTP is in the index, fine, the edit proceeds. If it isn't, the hook rejects the edit before it happens. And that's the move. Claude never gets to make the change, which means it never gets to turn around and tell me "done, the handler returns the right status now." The confident fiction dies in the doorway instead of in my codebase. The model can be as sure as it likes. The wall doesn't read tone.

After an edit does land, a second hook runs on the way out. It runs a verifier (by default go vet, but you point it at whatever your project actually uses to say "this is sound") and writes the result down. Did the thing the model just claimed to fix actually hold up? That goes into a claim ledger, a running record of what was asserted versus what was true. The confident "done" now has a paper trail, and the claims that never got verified pile up somewhere I can see them instead of dissolving into the chat.

#The three things it keeps

Step back and Leonard is really just keeping track of three kinds of ground truth, and standing between the model and the keyboard to enforce them:

None of that is clever. It's bookkeeping. The clever part, if there is one, is just refusing to let the bookkeeping be optional.

#What it doesn't do, because I learned this the hard way

I'll be straight about the edges. The wall checks things that are mechanically true or false. Does this symbol exist. Does this compile. Does the verifier pass. It does not, and cannot, tell you the code is good, or that it solves the right problem, or that a function which genuinely exists is the right one to call. It shrinks the lying surface. It doesn't replace judgment.

And a wall is only as honest as the person who built it. The part I find funny in hindsight is that I shipped Leonard, trusted it, and then on a later round pointed Leonard's own discipline back at Leonard's own code. It found a bug that re-introduced the exact failure the whole tool exists to prevent: a deleted function still verifying as present. That's the next part of this series, and it's the most humbling thing I've written about in a while.

One quieter design note I'll come back to later: that second hook runs a command on your machine, which should make you a little nervous, and Leonard has a fingerprint check so a tampered config can't slip a command past you. More on that when we get to the bigger idea underneath all of this.

When you let an AI write code, what's the one check you'd never want it to be able to skip?


← all writing