Sprint Contracts — Negotiated Agent Agreements
Why contracts matter
High-level product specs are intentionally vague about implementation details — specifying too much upfront risks cascading errors when the spec gets something wrong. But an evaluator needs concrete criteria to grade against. Sprint contracts bridge this gap: they are specific enough to test against but written just before implementation rather than at planning time.
Source: Harness design for long-running application development (Prithvi Rajasekaran, Anthropic, March 2026)
The negotiation flow
- Generator proposes — what it will build in this sprint and how success will be verified
- Evaluator reviews — checks whether the generator is building the right thing relative to the spec
- Iterate — the two agents go back and forth until they agree on the contract
- Generator builds — implements against the agreed contract
- Evaluator grades — tests against the contract’s criteria
Communication happens via files: one agent writes a file, and the other reads it and responds either in that same file or in a new one. No shared context window is needed.
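The propose, review, and iterate steps can be sketched as a small file-based loop. This is a minimal illustration rather than the harness described in the source: the two agents are replaced by deterministic stand-in functions, and the file names (`proposal_N.json`, `review_N.json`), the JSON layout, and the `MAX_ROUNDS` cap are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path

MAX_ROUNDS = 5  # assumed cap so the negotiation always terminates


def propose(path: Path, criteria: list[str], round_no: int) -> None:
    """Generator stand-in: write the proposed contract to a file."""
    path.write_text(json.dumps({"round": round_no, "criteria": criteria}))


def review(proposal_path: Path, review_path: Path, required: list[str]) -> bool:
    """Evaluator stand-in: read the proposal, write a review file, report agreement."""
    proposal = json.loads(proposal_path.read_text())
    missing = [c for c in required if c not in proposal["criteria"]]
    review_path.write_text(json.dumps({"approved": not missing, "missing": missing}))
    return not missing


def negotiate(workdir: Path, draft: list[str], required: list[str]) -> tuple[list[str], int]:
    """Alternate proposal and review files until the evaluator approves."""
    criteria = list(draft)
    for round_no in range(1, MAX_ROUNDS + 1):
        proposal_path = workdir / f"proposal_{round_no}.json"
        review_path = workdir / f"review_{round_no}.json"
        propose(proposal_path, criteria, round_no)
        if review(proposal_path, review_path, required):
            return criteria, round_no
        # The generator revises using only what the review file says,
        # so the two sides never share in-memory state.
        feedback = json.loads(review_path.read_text())
        criteria += feedback["missing"]
    raise RuntimeError("no agreement within MAX_ROUNDS")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        agreed, rounds = negotiate(
            Path(tmp),
            draft=["Rectangle fill tool allows click-drag to fill area"],
            required=[
                "Rectangle fill tool allows click-drag to fill area",
                "User can select and delete entity spawn points",
            ],
        )
        print(rounds, len(agreed))
```

In a real harness, each stand-in would be a model call that reads and writes the same files. The point of the sketch is only that the files carry the entire exchange.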
Granularity in practice
Contracts can be very granular. In one example, Sprint 3 alone had 27 criteria covering the level editor. The evaluator’s findings were specific enough to act on without extra investigation:
| Contract criterion | Evaluator finding |
|---|---|
| Rectangle fill tool allows click-drag to fill area | FAIL — Tool only places tiles at drag start/end points instead of filling the region |
| User can select and delete entity spawn points | FAIL — Delete key handler requires both selection and selectedEntityId to be set, but clicking an entity only sets one |
| User can reorder animation frames via API | FAIL — PUT /frames/reorder route defined after /{frame_id} routes; FastAPI matches “reorder” as a frame_id integer |
These are implementation-level bugs caught by contract-level criteria — the contract translated high-level user stories into testable behaviors that the evaluator could exercise through browser automation.
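One way to picture the pairing of criteria and findings in the table is as a small record type. The sketch below is hypothetical: the class names, fields, and the rule that a sprint passes only when every criterion is graded and none failed are assumptions for illustration, not the format used in the source.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    criterion: str
    passed: bool
    note: str = ""


@dataclass
class SprintContract:
    sprint: int
    criteria: list[str]
    findings: list[Finding] = field(default_factory=list)

    def record(self, criterion: str, passed: bool, note: str = "") -> None:
        """Attach an evaluator finding to a criterion that is in the contract."""
        if criterion not in self.criteria:
            raise ValueError(f"not in contract: {criterion}")
        self.findings.append(Finding(criterion, passed, note))

    def failures(self) -> list[Finding]:
        return [f for f in self.findings if not f.passed]

    def ungraded(self) -> list[str]:
        graded = {f.criterion for f in self.findings}
        return [c for c in self.criteria if c not in graded]

    def sprint_passed(self) -> bool:
        # Assumed rule: every criterion graded, and no failures.
        return not self.ungraded() and not self.failures()


contract = SprintContract(
    sprint=3,
    criteria=[
        "Rectangle fill tool allows click-drag to fill area",
        "User can select and delete entity spawn points",
    ],
)
contract.record(
    "Rectangle fill tool allows click-drag to fill area",
    passed=False,
    note="Tool only places tiles at drag start/end points",
)
print(contract.sprint_passed(), contract.ungraded())
```

Rejecting findings for criteria that are not in the contract keeps the evaluator grading against what was agreed, not against requirements introduced after the fact.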
The spec cascade problem
The planner, the agent that produces the high-level spec, intentionally avoids specifying granular technical details. If it gets implementation details wrong upfront, those errors cascade into downstream work. Sprint contracts solve this by deferring implementation specifics to the moment just before building, when the generator has the most context about the current state of the codebase.
This mirrors a well-known software engineering principle: defer decisions to the last responsible moment. The sprint contract is that moment for agent harnesses.
Related Notes
- GAN-Inspired Agent Architecture - Generator Evaluator Loops — the three-agent architecture where sprint contracts operate
- Spec-Driven Development and AI-Native SDLC - 2026 Analysis — specs as the planning layer; sprint contracts are the implementation-level complement
- Agent Harnesses — sprint contracts are a harness pattern for decomposing work