May 28, 2026

AFK Development: Supervision, Not Magic

Letting agents build while you sleep works, but the engineering that makes it work is closer to site reliability engineering (SRE) than to prompting.

AFK (away from keyboard) development is real: I regularly leave a multi-agent job running overnight and wake up to reviewable work. What makes it reliable has almost nothing to do with model intelligence.

The three systems that matter

Budgets. Every task carries a hard ceiling: tokens, wall-clock time, tool calls. An agent that would burn the night on a doomed approach instead hits its budget, writes a summary of what failed, and parks the task for review.
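
To make the budget idea concrete, here is a minimal sketch in Python. It is not the code I run in production. The names `Budget`, `BudgetExceeded`, and `run_task` are hypothetical, and each agent step is reduced to a `(tokens, tool_calls)` pair standing in for real work.

```python
import time
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a task crosses any of its hard ceilings."""


@dataclass
class Budget:
    # Hypothetical ceilings; real values depend on the task.
    max_tokens: int
    max_seconds: float
    max_tool_calls: int
    tokens: int = 0
    tool_calls: int = 0
    started: float = field(default_factory=time.monotonic)

    def charge(self, tokens: int = 0, tool_calls: int = 0) -> None:
        """Record usage, then fail loudly if any ceiling is exceeded."""
        self.tokens += tokens
        self.tool_calls += tool_calls
        elapsed = time.monotonic() - self.started
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"tokens {self.tokens} > {self.max_tokens}")
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded(f"tool calls {self.tool_calls} > {self.max_tool_calls}")
        if elapsed > self.max_seconds:
            raise BudgetExceeded(f"wall clock {elapsed:.0f}s > {self.max_seconds}s")


def run_task(steps, budget: Budget) -> dict:
    """Run steps until done or over budget; park with a summary on failure.

    `steps` is an iterable of (tokens_used, tool_calls_used) pairs,
    an assumed stand-in for real agent steps.
    """
    completed = 0
    try:
        for tokens, calls in steps:
            budget.charge(tokens=tokens, tool_calls=calls)
            completed += 1
    except BudgetExceeded as exc:
        # Park for human review instead of retrying through the night.
        return {"status": "parked", "reason": str(exc), "steps_completed": completed}
    return {"status": "done", "reason": "", "steps_completed": completed}
```

The design choice worth noting is that exceeding the budget produces a parked task with a reason attached, not a silent kill, so the morning review starts from a summary of what failed.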

Heartbeats. A supervisor process checks whether each agent is actually advancing its task state, not just producing tokens. An agent that keeps producing text while its task state never advances is the equivalent of a spinning process, and it is shockingly common.
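
A rough sketch of that check, under some assumptions: each agent reports a `state_version` counter that only increases when task state really changes (a hypothetical convention), and the supervisor polls at a fixed interval. `AgentPulse`, `Supervisor`, and the threshold of three stalled checks are illustrative, not a real library.

```python
from dataclasses import dataclass


@dataclass
class AgentPulse:
    state_version: int   # assumed counter: bumps only when task state advances
    tokens_emitted: int  # reported, but deliberately ignored by the health check


class Supervisor:
    def __init__(self, max_stalled_checks: int = 3):
        self.max_stalled_checks = max_stalled_checks
        self._last_version = {}  # agent_id -> last seen state_version
        self._stalled = {}       # agent_id -> consecutive checks without progress

    def check(self, agent_id: str, pulse: AgentPulse) -> str:
        """Return 'healthy', 'watch', or 'stalled' for one polling tick."""
        last = self._last_version.get(agent_id)
        self._last_version[agent_id] = pulse.state_version
        if last is None or pulse.state_version > last:
            # First sighting or real progress: reset the stall counter.
            self._stalled[agent_id] = 0
            return "healthy"
        # Tokens may still be flowing, but the task state did not move.
        self._stalled[agent_id] = self._stalled.get(agent_id, 0) + 1
        if self._stalled[agent_id] >= self.max_stalled_checks:
            return "stalled"
        return "watch"
```

Note that `tokens_emitted` never enters the decision. An agent whose token count climbs while `state_version` stays flat is exactly the spinning case, and it gets flagged after the configured number of checks.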

Checkpoints. Irreversible actions (deploys, deletions, external messages) require a human approval that can be granted from a phone. Everything else proceeds. The trick is drawing that line precisely; too conservative and the AFK property disappears, too loose and you wake up to surprises.
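
One way to draw that line in code, as a sketch only: classify each action by kind, let reversible kinds through, and queue irreversible ones until a human approves them. The kind names, `CheckpointGate`, and the action IDs are all hypothetical, and the phone approval is reduced to a call to `approve`.

```python
from dataclasses import dataclass, field

# Assumed classification; where this line sits is the real design decision.
IRREVERSIBLE_KINDS = frozenset({"deploy", "delete", "external_message"})


@dataclass
class CheckpointGate:
    approved: set = field(default_factory=set)
    pending: list = field(default_factory=list)

    def approve(self, action_id: str) -> None:
        """Record a human approval (for example, tapped from a phone)."""
        self.approved.add(action_id)
        if action_id in self.pending:
            self.pending.remove(action_id)

    def may_run(self, action_id: str, kind: str) -> bool:
        """Reversible actions always proceed; irreversible ones wait for approval."""
        if kind not in IRREVERSIBLE_KINDS:
            return True
        if action_id in self.approved:
            return True
        if action_id not in self.pending:
            self.pending.append(action_id)  # surfaces in the approval queue
        return False
```

For example, `gate.may_run("t1", "run_tests")` returns `True` immediately, while `gate.may_run("d1", "deploy")` returns `False` and leaves `"d1"` in `gate.pending` until `gate.approve("d1")` is called.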

Where this is going

The interesting frontier is not smarter agents. It is better management infrastructure: task graphs that survive agent failure, memory that transfers between team members, and reviews that take minutes instead of hours. That is the platform I am building now.
