My AI coding assistant kept lying to me. So I built Leonard.

2026-06-10 · ~~6 min read · ai · mcp · go · claude-code · leonard

First in a short series on building Leonard, the local-first tool I made to keep Claude Code honest against my actual code.

The first time it really got me, I'd asked Claude to fix a bug. It thought for a second, made an edit, and told me "Done. The handler returns the right status now." So I went to look. The function it said it had changed didn't exist. Not anymore, not ever. It had described, in detail, work it did on a thing that wasn't there, in the exact same calm tone it uses for everything else.

That's the part that stuck with me. Not that it was wrong. Everybody's wrong sometimes, me more than most. It's that it was wrong with a completely straight face. No hedge, no "I think," no "you might want to double-check this." Just done, boss.

And once I started watching for it, I saw it everywhere. It'd call a helper that used to exist before I renamed it. It'd swear the tests passed on code it hadn't actually run. It'd reference a config key it wished were there, because a key like that should be there. Little confident fictions, sprinkled into otherwise good work, and no flag on any of them telling me which parts to trust.

#It isn't lying, exactly, and that's what makes it worse

Here's the thing I had to get straight in my own head. The model isn't lying the way a person lies. There's no intent behind it. It's a very good pattern-matcher that has no real connection to whether the thing it's saying is true. It says the function exists because, statistically, a function with that name fits right there. It says the edit worked because edits usually work. It was trained to sound sure, so it sounds sure either way, whether it's looking at solid ground or thin air.

A person who lies at least knows the truth and is choosing to hide it. This is worse in a way. The model genuinely cannot tell you which of its claims are real, because it doesn't know either. It has no ground truth. It has a very confident guess about what ground truth probably looks like.

When you're moving fast, leaning on it to do the heavy lifting, that's a quiet little landmine in every session.

#The fix I tried first was the obvious wrong one

So I did what everybody does. I tried to prompt my way out of it. "Don't make things up." "Check that the function exists before you call it." "Tell me when you're not sure." I wrote increasingly stern instructions, the way you'd talk to a contractor who keeps cutting corners.

It worked for about a day. Then it drifted right back, because a prompt is a suggestion, and I was handing suggestions to a system that treats them more like a mood than a rule. You can't instruct your way out of a problem that's baked into how the thing works. Asking a model that has no ground truth to please stop guessing is just asking it to guess more politely.

I kept at that longer than I'd like to admit. Eventually it clicked that I was solving it from the wrong end.

#Stop asking the model to be honest. Check its work against reality.

The move wasn't to make the model trustworthy. It was to put something next to it that already knows the truth, and make every claim pass through that thing first.

If Claude says it's about to use a function, don't take its word for it. Go look in the actual code and see if that function is really there. If it says an edit compiles, don't believe the summary, run the compiler and read the result. Don't trust the model's report of reality. Check reality, with something outside the model that can't be talked into a different answer.

That's the whole idea behind Leonard. It keeps a plain index of every function and type that actually exists in the project, pulled straight from the source. Before an edit lands, a hook (just a little script that runs first) takes the names Claude wants to use and checks them against that index. If the index has no such thing, the edit gets stopped before Claude ever gets to say "done." It also keeps a running log of what the model claimed versus what actually happened, so the confident fictions have somewhere to get caught instead of sailing straight into my codebase.

It's a tool I built for myself and use every day. It's not magic, and it didn't come out clean the first time, or the fifth. But the core bet has held up: the model can't fabricate an API surface it wishes were there, because something boring and literal is standing between it and the keyboard, checking.

In the next part I'll get into how that actually works, the index, the hooks, and the trick of making honesty the only path through instead of a polite request. And later in the series, the genuinely humbling bit: what happened when I pointed Leonard at its own code and it found things I really did not want to find.

What's the thing your AI tool told you it had done, in that same confident voice, that turned out to never have happened at all?

← all writing