Every agent task needs a termination condition. If the task has ambiguous completion criteria, the agent will sprawl until you run out of tokens.
Author: James Phoenix | Date: March 2026
The Two Modes
There are two fundamentally different ways to constrain coding agents:
Mode 1: Orchestrated Workflows (Programmatic)
Custom prompts, skills, MCP servers, or agent framework code that dictate the workflow step by step.
Prompt A → Tool Call → Prompt B → Verification → Done
Strengths:
- Predictable token spend per task
- Completion is built into the workflow (the last step is the last step)
- Easy to cap iterations, enforce exit conditions
- Best for tasks where you know the shape of the answer
Weaknesses:
- Rigid. The agent can’t explore outside the programmed path
- Front-loads engineering effort into workflow design
- Breaks when the task doesn’t fit the template
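The Prompt A → Tool Call → Prompt B shape can be sketched in a few lines. This is a minimal illustration, not a framework: `call_model` is a hypothetical stand-in that just appends a tagged line so the example runs standalone; a real implementation would call your agent or LLM API.

```python
# Sketch of a Mode 1 orchestrated workflow: a fixed list of prompts, executed
# in order. call_model is a hypothetical stand-in for a real agent/LLM call;
# here it just appends a tagged line so the example runs on its own.
def call_model(prompt: str, state: str) -> str:
    return f"{state}\n[model output for: {prompt}]".strip()

def run_orchestrated(task: str, step_prompts: list[str]) -> str:
    state = f"task: {task}"
    for prompt in step_prompts:   # the workflow dictates every step
        state = call_model(prompt, state)
    return state                  # the last step is the last step: built-in stop

result = run_orchestrated(
    "add a /health endpoint",
    ["Draft the change", "Write tests", "Fix anything the tests caught"],
)
```

Because the step list is fixed, token spend per task is roughly constant and termination is guaranteed by construction.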
Mode 2: Free-Form + Deterministic Checks (Search-Based)
Let the agent create freely, then prune with linters, tests, parsers, and verification scripts.
Agent explores freely → Linters/tests reject bad output → Loop until checks pass
Strengths:
- Agents can explore, discover, and solve novel problems
- No need to anticipate every step in advance
- Better for exploratory, creative, or research tasks
Weaknesses:
- Checks must be watertight; every gap leaves the search space unbounded
- Without a completion signal, the agent loops forever
- Token spend is unpredictable and can explode
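The generate-then-prune loop can be sketched as follows. `generate` is a hypothetical stand-in for an agent call; the checks are ordinary deterministic predicates, and the iteration cap is the guard against the unbounded-search weakness above.

```python
# Sketch of the Mode 2 loop: the agent generates freely, deterministic checks
# prune bad output, and a hard iteration cap prevents looping forever.
def mode2_loop(generate, checks, max_iters=5):
    feedback = ""
    for _ in range(max_iters):
        candidate = generate(feedback)
        failures = [name for name, check in checks if not check(candidate)]
        if not failures:
            return candidate          # all checks green = done
        feedback = f"failed: {failures}"
    raise RuntimeError("iteration cap hit: completion criteria too loose?")

# Toy usage: the "agent" emits progressively longer strings until a
# deterministic length check passes.
attempts = iter(["x", "xxxx", "x" * 10])
result = mode2_loop(
    generate=lambda feedback: next(attempts),
    checks=[("long_enough", lambda c: len(c) >= 8)],
)
```

Note the asymmetry with Mode 1: here termination depends entirely on the quality of `checks`, which is exactly why watertight verifiers matter.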
The Danger: Agent Sprawl
Agent sprawl is what happens when an agent runs on a task with no clear termination condition. It keeps iterating, generating, revising, and burning tokens without converging on “done.”
Root cause: The task has ambiguous completion criteria.
Where it hits hardest: Tasks where “good enough” is subjective.
| Task Type | Completion Signal | Sprawl Risk |
|---|---|---|
| Fix failing test | Test passes | Low |
| Implement API endpoint | Tests pass + types check | Low |
| Write PRD / design doc | ??? | High |
| Refactor for readability | ??? | High |
| Write documentation | ??? | High |
| Explore solution space | ??? | High |
Code tasks have natural termination: tests pass, types check, linter is green. Document tasks have no equivalent built-in gate.
The Completion Verification Gap
In Mode 1 (orchestrated), completion is easy. The workflow has a final step. You programmed when to stop.
In Mode 2 (free-form), you must build a completion verifier for every task type. Without one, you’re running an unbounded loop.
For code:
```bash
# Natural completion verifier
npm test && npm run typecheck && npm run lint
# All green = done
```
For docs, you need to invent one:
```bash
# Example: PRD completion verifier
./scripts/verify-prd.sh docs/prd.md
# Checks:
# - All required sections present (problem, solution, scope, metrics)
# - Word count within bounds (not too thin, not bloated)
# - No TODO/TBD placeholders remaining
# - Links to design doc exist
# - Acceptance criteria are testable (contain measurable assertions)
```
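A minimal sketch of what such a verifier could look like in Python. The required section names, word-count bounds, and placeholder markers are assumptions for illustration; tune them to your own templates.

```python
import re

# Hedged sketch of a PRD completion verifier. The section list, word-count
# bounds, and TODO/TBD markers are assumptions, not a standard.
REQUIRED_SECTIONS = ["problem", "solution", "scope", "metrics"]

def verify_prd(text: str) -> list[str]:
    errors = []
    lower = text.lower()
    for section in REQUIRED_SECTIONS:
        # assumes markdown-style headings somewhere in the doc
        if not re.search(rf"^#+\s*.*{section}", lower, re.MULTILINE):
            errors.append(f"missing section: {section}")
    words = len(text.split())
    if not 300 <= words <= 3000:          # not too thin, not bloated
        errors.append(f"word count out of bounds: {words}")
    if re.search(r"\b(TODO|TBD)\b", text):
        errors.append("placeholder (TODO/TBD) remaining")
    return errors                          # empty list = done
```

Wired into the free-form loop, the agent revises until `verify_prd` returns an empty list, which gives a doc task the same "all green = done" property code tasks get for free.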
If you can’t write a verification script for a task, that task should not be given to a free-form agent. Use Mode 1 or do it yourself.
The Decision Framework
```
Are the completion criteria deterministic?
│
├── YES → Mode 2 is safe (free-form + checks)
│         Examples: code, migrations, config changes
│
└── NO → Does the task have a known shape?
         │
         ├── YES → Mode 1 (orchestrated workflow)
         │         Examples: PRD from template, changelog from commits
         │
         └── NO → Human-in-the-loop required
                  Examples: architecture decisions, novel design
```
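The same tree expressed as a function, useful if you route tasks programmatically. The labels are mine, not a formal taxonomy.

```python
# The decision framework as a routing function: two boolean questions,
# three outcomes.
def choose_mode(deterministic_completion: bool, known_shape: bool) -> str:
    if deterministic_completion:
        return "Mode 2: free-form + deterministic checks"
    if known_shape:
        return "Mode 1: orchestrated workflow"
    return "human-in-the-loop"

mode = choose_mode(deterministic_completion=False, known_shape=True)
# → "Mode 1: orchestrated workflow"
```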
Practical Rules
- Never run free-form agents on doc-phase tasks without a completion verifier. PRDs, design docs, and documentation have no natural stopping point. The agent will revise endlessly.
- If using Mode 2, your linters must be watertight. Every gap in your checks is a dimension the agent can sprawl into. Partial linting is worse than no linting because it gives false confidence.
- Cap token budgets as a safety net. Even with good checks, set a hard ceiling. If the agent hits the cap, it’s a signal your completion criteria aren’t tight enough.
- Doc-phase work should migrate to implementation quickly. The longer you stay in the doc phase, the higher the sprawl risk. Get to code (where termination is natural) as fast as possible.
- Completion verification scripts are infrastructure, not overhead. If you’re running agents on a task type repeatedly, build the verifier. It pays for itself in one session.
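The token-budget cap from the rules above can be sketched as a small guard object. The numbers are illustrative; real agent SDKs report usage in their own ways, so `charge` would be fed from whatever usage metadata your stack exposes.

```python
# Sketch of a hard token ceiling as a safety net. Hitting the cap is treated
# as a signal, not routine: it means the completion criteria are too loose.
class TokenBudget:
    def __init__(self, ceiling: int):
        self.ceiling = ceiling
        self.spent = 0

    def charge(self, tokens: int) -> None:
        self.spent += tokens
        if self.spent > self.ceiling:
            raise RuntimeError(
                f"token budget exceeded: {self.spent}/{self.ceiling}"
            )

budget = TokenBudget(1000)
budget.charge(600)          # fine: under the ceiling
try:
    budget.charge(600)      # 1200 > 1000: the safety net fires
    capped = False
except RuntimeError:
    capped = True
```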
The Lesson
I burned 50% of a weekly Codex token budget on agent sprawl during doc-phase work. The agents kept revising PRDs and design docs with no termination signal. Code tasks with tests converge naturally; doc tasks without verifiers never converge.
The fix: either orchestrate doc tasks with a rigid workflow (Mode 1) or build a deterministic completion checker before handing them to a free-form agent (Mode 2). There is no safe third option.
Related
- Constraint-First Development – Constraints as specifications
- Agent-Driven Development – PRD quality vs. trajectory, guardrails for meta-tasks
- Building the Harness – Layered constraint architecture
- 12 Factor Agents – Micro-agents within deterministic DAGs
- Constraint Escalation Ladder – Choosing the right prevention layer

