Process

Two threads, one repo. Files carry the program.

I use AI agents as execution lanes, not as project memory. Control decides the next slice, workers return proof, and durable files carry the program from one fresh thread to the next.

Control decides next slice

The control lane owns sequencing. It reads the repo state, validates what is true, and chooses the next smallest meaningful slice.

This is where judgment happens. The model is not asked to wander through the project or infer a roadmap from chat memory. Control turns the current plan into one specific unit of work.

Control writes bounded prompt

The prompt is the contract: objective, scope, non-goals, validation commands, and the definition of done.

The execution lane should not negotiate scope, reopen accepted work, or guess at missing policy. The prompt carries only what the worker needs to complete this slice.

Worker receives one slice, no prior context

The worker enters as a disposable execution session. One slice, one spec, no inherited project memory.

That isolation is the point. It prevents stale assumptions from becoming implementation policy and keeps the worker focused on producing the requested artifact.

Worker returns diff and tests

The worker returns code, tests, validation output, known risks, and a clear completion state.

A passing command is useful, but it is not the decision. The returned diff is evidence for review, not permission to advance.

Control audits returned work

Control reviews the returned work against the slice spec, type safety, accessibility, tests, build health, and proof gaps.

The default assumption is that issues exist until proven otherwise. Review starts with findings, not with a success summary.

Acceptance gate blocks advancement

If any issue is found, control summarizes it plainly, recommends a path, and waits for explicit human go or no-go.

There is no silent accepted-with-risk state. Quality beats speed because the agent is already fast; the scarce resource is correctness.

PLAN.md and BUILDLOG.md record continuity

Accepted work is written back into durable project state so the next lane can restart from files alone.

PLAN.md states what is complete, next, deferred, and off-limits. BUILDLOG.md records the decision and acceptance status. The chat window is no longer the memory system.

Fresh control lane reloads from durable docs

When context degrades, a new control lane reads the control state bundle and resumes without hidden chat memory.

No session is precious. Durable docs, tests, and accepted artifacts carry the program. A fresh lane can decide the next authorized move from the repo.

process loop01 / 08

AGENTS.mdoperating contract

PLAN.mdcurrent plan

slice promptbounded handoff

worker lanedisposable session

diff + testsreturned proof

findings.mdacceptance gate

BUILDLOG.mdcontinuity log

Active state

Next slice selected

The control lane owns sequencing. It reads the repo state, validates what is true, and chooses the next smallest meaningful slice.

Control lane

Reads PLAN.md and decides one bounded move.

Worker lane

No worker is active yet.

Repo state

AGENTS.md and PLAN.md constrain the decision.

01Control decides next slice

02Control writes bounded prompt

03Worker receives one slice, no prior context

04Worker returns diff and tests

05Control audits returned work

06Acceptance gate blocks advancement

07PLAN.md and BUILDLOG.md record continuity

08Fresh control lane reloads from durable docs

Acceptance gateNo gate yet. The slice has not been issued.

Control decides next slice

The control lane owns sequencing. It reads the repo state, validates what is true, and chooses the next smallest meaningful slice.

This is where judgment happens. The model is not asked to wander through the project or infer a roadmap from chat memory. Control turns the current plan into one specific unit of work.

Control: Reads PLAN.md and decides one bounded move.
Worker: No worker is active yet.
Repo: AGENTS.md and PLAN.md constrain the decision.

Control writes bounded prompt

The prompt is the contract: objective, scope, non-goals, validation commands, and the definition of done.

The execution lane should not negotiate scope, reopen accepted work, or guess at missing policy. The prompt carries only what the worker needs to complete this slice.

Control: Freezes scope, blind spots, non-goals, and validation.
Worker: Still inactive. It has not received the packet.
Repo: The prompt points back to durable artifacts, not chat history.

Worker receives one slice, no prior context

The worker enters as a disposable execution session. One slice, one spec, no inherited project memory.

That isolation is the point. It prevents stale assumptions from becoming implementation policy and keeps the worker focused on producing the requested artifact.

Control: Waits for a bounded result instead of co-authoring the code.
Worker: Receives the slice prompt and implements within it.
Repo: The repo is the only durable context the worker can rely on.

Worker returns diff and tests

The worker returns code, tests, validation output, known risks, and a clear completion state.

A passing command is useful, but it is not the decision. The returned diff is evidence for review, not permission to advance.

Control: Has a concrete diff to inspect.
Worker: Reports changes, validation, blockers, and risks.
Repo: The diff and tests become inspectable repo state.

Control audits returned work

Control reviews the returned work against the slice spec, type safety, accessibility, tests, build health, and proof gaps.

The default assumption is that issues exist until proven otherwise. Review starts with findings, not with a success summary.

Control: Checks the diff against the definition of done.
Worker: No longer owns sequencing. It can receive corrections only.
Repo: Findings are written against files, tests, and visible behavior.

Acceptance gate blocks advancement

If any issue is found, control summarizes it plainly, recommends a path, and waits for explicit human go or no-go.

There is no silent accepted-with-risk state. Quality beats speed because the agent is already fast; the scarce resource is correctness.

Control: Accepts, rejects, or holds for human decision.
Worker: May receive a correction prompt if the slice is rejected.
Repo: findings.md is the gate record, not a notification.

PLAN.md and BUILDLOG.md record continuity

Accepted work is written back into durable project state so the next lane can restart from files alone.

PLAN.md states what is complete, next, deferred, and off-limits. BUILDLOG.md records the decision and acceptance status. The chat window is no longer the memory system.

Control: Updates the source of truth before moving on.
Worker: Disposable. Its useful output now lives in the repo.
Repo: PLAN.md and BUILDLOG.md carry the program forward.

Fresh control lane reloads from durable docs

When context degrades, a new control lane reads the control state bundle and resumes without hidden chat memory.

No session is precious. Durable docs, tests, and accepted artifacts carry the program. A fresh lane can decide the next authorized move from the repo.

Control: Reloads from AGENTS.md, PLAN.md, BUILDLOG.md, and architecture notes.
Worker: No worker is open until the next slice is authorized.
Repo: The files are the program state.

Proof standard

What the bar catches

The market dashboard made this concrete. The default backtest an agent produces is a lie. Given a macro dashboard spec, it uses the latest-revised series for each historical date. The chart runs, the tests pass, the numbers look right. The numbers are wrong. A value from 1995 was not that value in 1995. It has been revised nine times since.

The pipeline I shipped enforces a single invariant: every number on the screen was computed using only information available at that point in time. Vintage data only. Revisions timestamped. No lookahead. The bar had to be declared upfront, as a property of the system and not a note in a pull request. Without that, the dashboard would have shipped as confident fiction.

Proven in production

Where the standard was forged

This process was forged shipping two production systems independently. A bridal operations platform running on staff tablets. A market dashboard that has to stay honest across fifty years of macroeconomic data. Both taught me the same thing. The model is the execution engine. The rigor has to be mine.