I write about evaluation systems, agentic workflows, and the engineering tradeoffs of building AI products. Most posts are short notes from work: what I tried, what broke, and what I learned.
Recent writing
- How to Help a Coding Agent
  Coding agents do better when you give them full context, the right tools, and direct feedback from the systems they touch.
- From 0.5 to 2
  Where eval has the highest leverage in the product lifecycle, and why developer taste matters more than fast automation.
- How to DO Evals
  Why eval matters, the three conditions that make it trustworthy, and the four parts of an end-to-end eval system.