Sprint Contracts — Negotiated Agent Agreements

Why contracts matter

High-level product specs are intentionally vague about implementation details — specifying too much upfront risks cascading errors when the spec gets something wrong. But an evaluator needs concrete criteria to grade against. Sprint contracts bridge this gap: they are specific enough to test against but written just before implementation rather than at planning time.

Source: Harness design for long-running application development (Prithvi Rajasekaran, Anthropic, March 2026)

The negotiation flow

1. Generator proposes: what it will build in this sprint and how success will be verified
2. Evaluator reviews: checks whether the generator is building the right thing relative to the spec
3. Iterate: the two agents go back and forth until they agree on the contract
4. Generator builds: implements against the agreed contract
5. Evaluator grades: tests against the contract’s criteria

Communication happens through files: one agent writes a file, and the other reads it and responds either in that file or in a new one. The agents do not need a shared context window.
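
The negotiation steps can be sketched as a small file-based loop. This is a minimal illustration, not the harness described in the source: the file names (`proposal_N.json`, `review_N.json`, `contract.json`), the JSON fields, the round limit, and the two stub agents are all assumptions made for the example. In a real harness each stub would presumably be a model call that sees only the files.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Optional


def negotiate(
    workdir: Path,
    propose: Callable[[Optional[dict]], dict],
    review: Callable[[dict], dict],
    max_rounds: int = 5,
) -> Path:
    """Exchange proposal/review files until the evaluator approves."""
    feedback = None
    for round_no in range(1, max_rounds + 1):
        # Generator side: write a proposal, using the last review if any.
        proposal_path = workdir / f"proposal_{round_no}.json"
        proposal_path.write_text(json.dumps(propose(feedback), indent=2))

        # Evaluator side: read the proposal from disk and write a review.
        proposal = json.loads(proposal_path.read_text())
        review_path = workdir / f"review_{round_no}.json"
        review_path.write_text(json.dumps(review(proposal), indent=2))

        # Generator side again: read the review back from disk.
        feedback = json.loads(review_path.read_text())
        if feedback["approved"]:
            contract_path = workdir / "contract.json"
            contract_path.write_text(json.dumps(proposal, indent=2))
            return contract_path
    raise RuntimeError(f"no agreement after {max_rounds} rounds")


def stub_generator(feedback: Optional[dict]) -> dict:
    """Hypothetical generator: adds whatever criteria the evaluator requested."""
    criteria = ["Rectangle fill tool allows click-drag to fill area"]
    if feedback is not None:
        criteria += feedback["requested_criteria"]
    return {"sprint": 3, "criteria": criteria}


def stub_evaluator(proposal: dict) -> dict:
    """Hypothetical evaluator: insists on one criterion derived from the spec."""
    required = "User can select and delete entity spawn points"
    missing = [] if required in proposal["criteria"] else [required]
    return {"approved": not missing, "requested_criteria": missing}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        contract = negotiate(Path(tmp), stub_generator, stub_evaluator)
        print(contract.read_text())
```

Writing a fresh file per round, rather than editing one in place, keeps a reviewable history of the negotiation; the source allows either approach.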

Granularity in practice

Contracts can be very granular. In one example, Sprint 3 alone had 27 criteria covering the level editor. The evaluator’s findings were specific enough to act on without extra investigation:

Contract criterion	Evaluator finding
Rectangle fill tool allows click-drag to fill area	FAIL — Tool only places tiles at drag start/end points instead of filling the region
User can select and delete entity spawn points	FAIL — Delete key handler requires both `selection` and `selectedEntityId` to be set, but clicking an entity only sets one
User can reorder animation frames via API	FAIL — `PUT /frames/reorder` route defined after `/{frame_id}` routes; FastAPI matches “reorder” as a frame_id integer

These are implementation-level bugs caught by contract-level criteria — the contract translated high-level user stories into testable behaviors that the evaluator could exercise through browser automation.
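
The third finding is easy to reproduce in miniature. FastAPI evaluates path operations in the order they are declared, so a parameterized route declared first captures a literal path declared later. The sketch below uses a toy first-match router rather than FastAPI itself, so it runs on the standard library alone. The `FirstMatchRouter` class, the handler names, the `/frames` prefix on the parameterized route, and the 422 status used for the failed integer check are assumptions for illustration.

```python
import re


class FirstMatchRouter:
    """Toy router that tries routes in declaration order (not FastAPI itself)."""

    def __init__(self):
        self._routes = []

    def add(self, method, template, handler):
        # Turn "{name}" into a named group matching one path segment.
        # Templates are assumed to contain no other regex metacharacters.
        pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", template)
        self._routes.append((method, re.compile(pattern), handler))

    def dispatch(self, method, path):
        for route_method, regex, handler in self._routes:
            match = regex.fullmatch(path)
            if route_method == method and match:
                # The first matching route wins; later routes are never tried.
                return handler(**match.groupdict())
        return 404, "not found"


def update_frame(frame_id):
    # Stand-in for integer validation of a path parameter.
    if not frame_id.isdigit():
        return 422, f"frame_id must be an integer, got {frame_id!r}"
    return 200, f"updated frame {frame_id}"


def reorder_frames():
    return 200, "frames reordered"


# Buggy order: the parameterized route is declared before the literal one.
buggy = FirstMatchRouter()
buggy.add("PUT", "/frames/{frame_id}", update_frame)
buggy.add("PUT", "/frames/reorder", reorder_frames)

# Fixed order: the literal route is declared first.
fixed = FirstMatchRouter()
fixed.add("PUT", "/frames/reorder", reorder_frames)
fixed.add("PUT", "/frames/{frame_id}", update_frame)

print(buggy.dispatch("PUT", "/frames/reorder"))
# (422, "frame_id must be an integer, got 'reorder'")
print(fixed.dispatch("PUT", "/frames/reorder"))
# (200, 'frames reordered')
print(fixed.dispatch("PUT", "/frames/7"))
# (200, 'updated frame 7')
```

Declaring the literal route before the parameterized one, as in `fixed`, is the usual remedy. A criterion phrased at the user level ("can reorder animation frames via API") is enough to surface the bug, because the evaluator exercises the real endpoint instead of reading the route table.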

The spec cascade problem

The planner, the agent that writes the high-level spec, intentionally avoids specifying granular technical details. If it gets implementation details wrong up front, the errors cascade into downstream work. Sprint contracts solve this by deferring implementation specifics to the moment just before building, when the generator has the most context about the current state of the codebase.

This mirrors a well-known software engineering principle: defer decisions to the last responsible moment. The sprint contract is that moment for agent harnesses.

Related

GAN-Inspired Agent Architecture - Generator Evaluator Loops: the three-agent architecture where sprint contracts operate
Spec-Driven Development and AI-Native SDLC - 2026 Analysis: specs as the planning layer; sprint contracts are the implementation-level complement
Agent Harnesses: sprint contracts are a harness pattern for decomposing work