Projects — Jason Dillingham

Selected applied AI engineering, infrastructure, and ops work. Most of this runs in production — sometimes for me, sometimes for the day job, sometimes for a one-person business. The thing I care about most is shipping end-to-end and operating what I build.

Applied AI / multi-agent systems

[ public OSS :: production ]

bosun — parallel Claude Code orchestration with safety contract

Go CLI + 9-tool MCP server that orchestrates parallel Claude Code sessions across isolated git worktrees. Explicit safety contract: only bosun merge touches main; never pushes, never modifies global git config, never touches files outside the project tree or its named worktrees. Two security audits + three independent review rounds; zero HIGH findings open. The bosun repo dogfoods bosun for its own development — every release ships under bosun coordination. Apache-2.0.

github.com/jasondillingham/bosun →
Read the dogfood writeup →

Tech: Go · MCP · git worktrees · safety contract · multi-OS CI (macOS + Ubuntu + Windows)
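
The "never touches files outside the project tree or its named worktrees" rule can be illustrated with a small containment check. This is a minimal sketch, not bosun's actual code: `withinRoot`, `allowedPath`, and the example paths are hypothetical names chosen for illustration, and the check is purely lexical, so it assumes symlinks have already been resolved (for example with `filepath.EvalSymlinks`).

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// withinRoot reports whether target, once cleaned, is root itself or lies beneath it.
// Hypothetical helper: lexical only, it does not resolve symlinks.
func withinRoot(root, target string) bool {
	rel, err := filepath.Rel(filepath.Clean(root), filepath.Clean(target))
	if err != nil {
		return false
	}
	// A relative path that is ".." or starts with "../" escapes the root.
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// allowedPath reports whether target is inside the project tree or any named worktree.
func allowedPath(projectRoot string, worktrees []string, target string) bool {
	if withinRoot(projectRoot, target) {
		return true
	}
	for _, wt := range worktrees {
		if withinRoot(wt, target) {
			return true
		}
	}
	return false
}

func main() {
	worktrees := []string{"/work/app-wt/feature-a"} // assumed worktree location
	fmt.Println(allowedPath("/work/app", worktrees, "/work/app/cmd/main.go"))        // true
	fmt.Println(allowedPath("/work/app", worktrees, "/work/app-wt/feature-a/x.go")) // true
	fmt.Println(allowedPath("/work/app", worktrees, "/work/app/../other/secret"))   // false
}
```

Comparing the cleaned relative path rather than doing a raw prefix match avoids treating a sibling directory such as `/work/application` as part of `/work/app`.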

[ public OSS :: production ]

Leonard — ground-truth toolkit preventing Claude hallucination

Local-first toolkit that prevents Claude Code from hallucinating over a project's lifetime. Symbol index + decision log + claim ledger exposed through a 10-tool MCP server and enforced through pre/post-edit hooks. The post-edit hook runs the project's verifier (go vet, cargo check, etc.) on every Claude edit and writes the outcome to the claim ledger — an evaluation pipeline for AI-generated code. Path-trust guard rejects file paths resolving outside project root. Verifier trust system gates post-edit command execution by SHA-256 fingerprint. Four focused security review rounds; every CRITICAL and HIGH finding closed. 42 file formats indexed. Apache-2.0.

github.com/jasondillingham/leonard →

Tech: Go · MCP · tree-sitter · pre/post-edit hooks · evaluation pipeline · path-trust security
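
Gating verifier execution by SHA-256 fingerprint can be sketched roughly as follows. This is an illustration under stated assumptions, not Leonard's implementation: it assumes the fingerprint covers the verifier's argument vector, and `fingerprint`, `trustStore`, `approve`, and `trusted` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// fingerprint returns the hex SHA-256 of a verifier command line.
// Arguments are joined with NUL, which cannot appear inside an argument,
// so two different argument vectors never produce the same input.
func fingerprint(argv []string) string {
	sum := sha256.Sum256([]byte(strings.Join(argv, "\x00")))
	return hex.EncodeToString(sum[:])
}

// trustStore holds the fingerprints a user has explicitly approved (hypothetical type).
type trustStore map[string]bool

// approve records the command's fingerprint as trusted.
func (t trustStore) approve(argv []string) { t[fingerprint(argv)] = true }

// trusted reports whether this exact command was approved earlier.
func (t trustStore) trusted(argv []string) bool { return t[fingerprint(argv)] }

func main() {
	store := trustStore{}
	store.approve([]string{"go", "vet", "./..."})

	fmt.Println(store.trusted([]string{"go", "vet", "./..."}))           // true
	fmt.Println(store.trusted([]string{"go", "vet", "./...", "--evil"})) // false
}
```

Any change to the command, however small, yields a different fingerprint, so a modified verifier has to be approved again before a hook would run it.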

[ flagship :: production ]

StyxCRM — multi-agent CRM with integrated AI design pipeline

Solo-built in Go. ~50K LOC across ~27 specialized CLI tools, one long-running server, and a job queue that coordinates local (gemma4 via Ollama) and cloud (Claude) inference. SQLite backbone, chi router, HTMX frontend with no build step. Real production state machine across a multi-stage pipeline, full audit log, eval framework for prompt versions, sync daemon between two hosts. Runs in production every day.

Tech: Go · Ollama · Claude · SQLite · chi · HTMX · multi-agent orchestration
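
A state machine with an audit trail, of the general kind described above, might look like the sketch below. The stage names, the transition table, and the `Job` type are all hypothetical; StyxCRM's real stages and schema are not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// Stage is a pipeline stage. The names below are illustrative assumptions.
type Stage string

const (
	StageResearch Stage = "research"
	StageDesign   Stage = "design"
	StageReview   Stage = "review"
	StageDone     Stage = "done"
)

// transitions lists which stages may follow each stage.
var transitions = map[Stage][]Stage{
	StageResearch: {StageDesign},
	StageDesign:   {StageReview},
	StageReview:   {StageDesign, StageDone}, // review can send work back to design
}

// ErrIllegalTransition is returned when a move is not in the transition table.
var ErrIllegalTransition = errors.New("illegal stage transition")

// Job carries its current stage and an append-only audit trail.
type Job struct {
	ID    int
	Stage Stage
	Audit []string
}

// Advance moves the job to the requested stage if the table allows it,
// recording the move in the audit trail; otherwise it leaves the job unchanged.
func (j *Job) Advance(to Stage) error {
	for _, next := range transitions[j.Stage] {
		if next == to {
			j.Audit = append(j.Audit, fmt.Sprintf("%s -> %s", j.Stage, to))
			j.Stage = to
			return nil
		}
	}
	return fmt.Errorf("%w: %s -> %s", ErrIllegalTransition, j.Stage, to)
}

func main() {
	job := &Job{ID: 1, Stage: StageResearch}
	fmt.Println(job.Advance(StageDesign)) // <nil>
	fmt.Println(job.Advance(StageDone))   // illegal stage transition: design -> done
	fmt.Println(job.Stage, job.Audit)
}
```

Keeping the allowed moves in one table means every stage change goes through a single function, which is also the natural place to write the audit entry.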

[ pipeline :: production ]

demogen — autonomous website generation pipeline

Multi-stage agent pipeline: business research → architect (design blueprint) → demogen (codegen) → review loop with multi-pass scoring. ~78 production websites generated end-to-end. Each output goes through a calibrated eval pass before it's considered shippable. The interesting part isn't any single agent — it's the orchestration patterns and the eval loop that catches regressions.

Tech: Go · Claude · Ollama · eval frameworks · prompt versioning
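
The review loop with multi-pass scoring can be sketched as below. This is a toy illustration, not demogen's code: `Scorer`, `meanScore`, `reviewLoop`, the threshold, and the stand-in scorers are all assumptions, and the real passes presumably call a model rather than measure string length.

```go
package main

import "fmt"

// Scorer is one review pass that rates an output between 0 and 1 (hypothetical type).
type Scorer func(output string) float64

// meanScore averages the scores of all passes; with no passes it returns 0.
func meanScore(output string, passes []Scorer) float64 {
	if len(passes) == 0 {
		return 0
	}
	total := 0.0
	for _, pass := range passes {
		total += pass(output)
	}
	return total / float64(len(passes))
}

// reviewLoop revises the draft until the mean score reaches the threshold
// or maxRounds revisions have been spent. It returns the final output,
// its score, and whether it cleared the threshold.
func reviewLoop(draft string, revise func(string) string, passes []Scorer,
	threshold float64, maxRounds int) (string, float64, bool) {
	out := draft
	score := meanScore(out, passes)
	for round := 0; round < maxRounds && score < threshold; round++ {
		out = revise(out)
		score = meanScore(out, passes)
	}
	return out, score, score >= threshold
}

// Toy stand-ins for real review passes.
func lengthScore(s string) float64 {
	if len(s) >= 10 {
		return 1
	}
	return float64(len(s)) / 10
}

func nonEmptyScore(s string) float64 {
	if s == "" {
		return 0
	}
	return 1
}

func main() {
	passes := []Scorer{lengthScore, nonEmptyScore}
	revise := func(s string) string { return s + "!" } // toy revision step
	out, score, ok := reviewLoop("abc", revise, passes, 0.75, 5)
	fmt.Println(out, score, ok) // abc!! 0.75 true
}
```

Bounding the loop with `maxRounds` keeps a draft that never improves from cycling forever; it simply comes back flagged as not shippable.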

Infrastructure

[ infrastructure :: production ]

Thor — local LLM inference on commodity hardware

Ryzen 9 3900X + RX 6700 XT (12GB) running Ollama with ROCm. gemma4 in production. Cost-effective alternative to API-only stacks for non-frontier workloads — powers StyxCRM's design and review pipelines at zero per-call cost. Sized deliberately for what a small team or solo operator can actually afford to run.

Tech: Linux · Ollama · ROCm · AMD GPU inference

[ platform :: production ]

Custom MCP servers + audit methodology

Production MCP servers exposing controlled access to homelab state and ops tooling for Claude. Includes a structured audit methodology for security-reviewing your own MCP servers; running three of mine through it back-to-back surfaced concrete fixes I would otherwise have missed.

Read the audit methodology →
Read the homelab MCP build →

Operations / Security

[ ops :: shipped ]

Dispatch — AP workflow built on Outlook categories

Production accounts-payable automation that uses Outlook categories as the actual state machine — meeting users where they already live instead of asking them to learn a new tool. Real users, real money, real edge cases.

Read the writeup →
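
Treating Outlook categories as the state machine can be pictured with a small sketch. The category names and their order here are hypothetical, not the ones Dispatch uses, and the code works on a plain list of category strings rather than calling any Outlook API.

```go
package main

import "fmt"

// order lists the workflow states from first to last.
// These category names are assumptions made for illustration.
var order = []string{"AP: Received", "AP: Coded", "AP: Approved", "AP: Paid"}

// currentState returns the most advanced workflow category on a message,
// ignoring unrelated categories. It returns "" if none is present.
func currentState(categories []string) string {
	have := make(map[string]bool, len(categories))
	for _, c := range categories {
		have[c] = true
	}
	for i := len(order) - 1; i >= 0; i-- {
		if have[order[i]] {
			return order[i]
		}
	}
	return ""
}

// nextState returns the category that follows state in the workflow.
// ok is false when state is the final step or is not a workflow category.
func nextState(state string) (next string, ok bool) {
	for i, s := range order {
		if s == state && i+1 < len(order) {
			return order[i+1], true
		}
	}
	return "", false
}

func main() {
	msg := []string{"Vendor: Acme", "AP: Received", "AP: Coded"}
	state := currentState(msg)
	next, ok := nextState(state)
	fmt.Println(state, "->", next, ok) // AP: Coded -> AP: Approved true
}
```

Because the state lives on the message itself, users move an invoice forward by changing a category in the mail client they already use.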

[ security :: shipped ]

PhishGuard — in-house anti-phishing classifier

Phishing classifier built against the day job's actual phishing corpus rather than a generic dataset. Tuned for the categories of attack we actually see in the wild.

Read the writeup →

Always happy to walk through any of this on a call; reach out via contact. Code samples and architecture diagrams are available on request.