Agent Sprawl and the Two Constraint Modes

Every agent task needs a termination condition. If the task has ambiguous completion criteria, the agent will sprawl until you run out of tokens.

Author: James Phoenix | Date: March 2026

The Two Modes

There are two fundamentally different ways to constrain coding agents:

Mode 1: Orchestrated Workflows (Programmatic)

Custom prompts, skills, MCP servers, or agent framework code that dictate the workflow step by step.

Prompt A → Tool Call → Prompt B → Verification → Done

Strengths:

Predictable token spend per task
Completion is built into the workflow (the last step is the last step)
Easy to cap iterations and enforce exit conditions
Best for tasks where you know the shape of the answer

Weaknesses:

Rigid. The agent can’t explore outside the programmed path
Front-loads engineering effort into workflow design
Breaks when the task doesn’t fit the template
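
A minimal sketch of what Mode 1 looks like in practice, assuming each step of the pipeline above can be wrapped in a shell function. The step names and their bodies are hypothetical placeholders, not a real agent framework. The point is only that the step list is fixed, so the workflow ends when the list ends:

```shell
#!/usr/bin/env bash
# Mode 1 sketch: a fixed list of steps, run once each, in order.
# All step functions are hypothetical stand-ins for real prompts/tools.

step_prompt_a()  { echo "prompt A: draft"; }
step_tool_call() { echo "tool call: gather context"; }
step_prompt_b()  { echo "prompt B: refine"; }
step_verify()    { echo "verification: ok"; }

run_workflow() {
  local step
  for step in step_prompt_a step_tool_call step_prompt_b step_verify; do
    # A failing step stops the workflow; nothing is retried open-endedly.
    "$step" || { echo "failed at $step"; return 1; }
  done
  # Reaching the end of the list is the termination condition.
  echo "done"
}
```

Calling run_workflow executes exactly four steps and then stops. Token spend is bounded by the number of steps, which is why this mode is predictable and also why it is rigid.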

Mode 2: Free-Form + Deterministic Checks (Search-Based)

Let the agent create freely, then prune with linters, tests, parsers, and verification scripts.

Agent explores freely → Linters/tests reject bad output → Loop until checks pass

Strengths:

Agents can explore, discover, and solve novel problems
No need to anticipate every step in advance
Better for exploratory, creative, or research tasks

Weaknesses:

Checks must be watertight, otherwise the search space is unbounded
Without a completion signal, the agent loops forever
Token spend is unpredictable and can explode
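
A rough sketch of the Mode 2 loop, under the assumption that one agent turn and the deterministic checks can each be wrapped in a shell function. The names agent_step, checks_pass, and MAX_ITERATIONS are hypothetical. The iteration cap is a backstop, not a substitute for good checks:

```shell
#!/usr/bin/env bash
# Mode 2 sketch: let the agent work, reject with checks, loop until green.
# MAX_ITERATIONS, agent_step, and checks_pass are hypothetical names.
MAX_ITERATIONS="${MAX_ITERATIONS:-5}"

# Placeholder for one free-form agent turn (invoke your agent here).
agent_step() { :; }

# Placeholder for the deterministic checks. For a Node project this
# could be: npm test && npm run typecheck && npm run lint
checks_pass() { return 1; }

run_until_green() {
  local i=1
  while [ "$i" -le "$MAX_ITERATIONS" ]; do
    agent_step "$i"
    if checks_pass; then
      echo "converged after $i iteration(s)"
      return 0
    fi
    i=$((i + 1))
  done
  # No completion signal arrived: stop instead of sprawling.
  echo "gave up after $MAX_ITERATIONS iterations" >&2
  return 1
}
```

With the placeholder checks_pass, which always fails, the loop exits after five iterations with a non-zero status. That is the behavior you want from a task whose checks can never go green.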

The Danger: Agent Sprawl

Agent sprawl is what happens when an agent runs on a task with no clear termination condition. It keeps iterating, generating, revising, and burning tokens without converging on “done.”

Root cause: The task has ambiguous completion criteria.

Where it hits hardest: Tasks where “good enough” is subjective.

Task Type	Completion Signal	Sprawl Risk
Fix failing test	Test passes	Low
Implement API endpoint	Tests pass + types check	Low
Write PRD / design doc	???	High
Refactor for readability	???	High
Write documentation	???	High
Explore solution space	???	High

Code tasks have natural termination: tests pass, types check, linter is green. Document tasks have no equivalent built-in gate.

The Completion Verification Gap

In Mode 1 (orchestrated), completion is easy. The workflow has a final step. You programmed when to stop.

In Mode 2 (free-form), you must build a completion verifier for every task type. Without one, you’re running an unbounded loop.

For code:

# Natural completion verifier
npm test && npm run typecheck && npm run lint
# All green = done

For docs, you need to invent one:

# Example: PRD completion verifier
./scripts/verify-prd.sh docs/prd.md

# Checks:
# - All required sections present (problem, solution, scope, metrics)
# - Word count within bounds (not too thin, not bloated)
# - No TODO/TBD placeholders remaining
# - Links to design doc exist
# - Acceptance criteria are testable (contain measurable assertions)
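
One way the first three checks above might be sketched, assuming the PRD is a text or Markdown file whose section headings start with the section name. The function name verify_prd, the word bounds, and the heading pattern are all assumptions for illustration. The link and acceptance-criteria checks are left out because they depend on your repo layout and conventions:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a PRD verifier. Bounds and section names are
# illustrative assumptions, not a standard.
MIN_WORDS="${MIN_WORDS:-300}"
MAX_WORDS="${MAX_WORDS:-3000}"

verify_prd() {
  local file="$1" failures=0 section words
  [ -f "$file" ] || { echo "FAIL: $file not found"; return 1; }

  # 1. Required sections: a line starting with the name, optionally
  #    preceded by Markdown '#' markers (case-insensitive).
  for section in Problem Solution Scope Metrics; do
    if ! grep -qi "^#* *$section" "$file"; then
      echo "FAIL: missing section: $section"
      failures=$((failures + 1))
    fi
  done

  # 2. Word count within bounds (tr strips padding some wc builds add).
  words=$(wc -w < "$file" | tr -d ' ')
  if [ "$words" -lt "$MIN_WORDS" ] || [ "$words" -gt "$MAX_WORDS" ]; then
    echo "FAIL: word count $words outside $MIN_WORDS-$MAX_WORDS"
    failures=$((failures + 1))
  fi

  # 3. No TODO/TBD placeholders left behind.
  if grep -qwE 'TODO|TBD' "$file"; then
    echo "FAIL: TODO/TBD placeholder remaining"
    failures=$((failures + 1))
  fi

  if [ "$failures" -eq 0 ]; then echo "PASS"; return 0; fi
  return 1
}
```

The exit status is what matters: zero means done, non-zero means the agent keeps going, and each FAIL line tells it what to fix next.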

If you can’t write a verification script for a task, that task should not be given to a free-form agent. Use Mode 1 or do it yourself.

The Decision Framework

Are the completion criteria deterministic?
  │
  ├── YES → Mode 2 is safe (free-form + checks)
  │         Examples: code, migrations, config changes
  │
  └── NO → Does the task have a known shape?
            │
            ├── YES → Mode 1 (orchestrated workflow)
            │         Examples: PRD from template, changelog from commits
            │
            └── NO → Human-in-the-loop required
                      Examples: architecture decisions, novel design
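
The tree above is small enough to encode directly. A sketch, assuming you answer the two questions yourself with "yes" or "no". The function name choose_mode and its return labels are hypothetical:

```shell
#!/usr/bin/env bash
# choose_mode DETERMINISTIC KNOWN_SHAPE
# Each argument is "yes" or "no". Hypothetical helper mirroring the tree.
choose_mode() {
  local deterministic="$1" known_shape="$2"
  if [ "$deterministic" = "yes" ]; then
    echo "mode2"   # free-form + deterministic checks
  elif [ "$known_shape" = "yes" ]; then
    echo "mode1"   # orchestrated workflow
  else
    echo "human"   # human-in-the-loop required
  fi
}
```

For example, choose_mode no yes prints mode1, which is the "PRD from template" case. Note that the first question short-circuits the second: if completion is deterministic, the shape of the task does not matter.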

Practical Rules

Never run free-form agents on doc-phase tasks without a completion verifier. PRDs, design docs, and documentation have no natural stopping point. The agent will revise endlessly.
If using Mode 2, your linters must be watertight. Every gap in your checks is a dimension the agent can sprawl into. Partial linting is worse than no linting because it gives false confidence.
Cap token budgets as a safety net. Even with good checks, set a hard ceiling. If the agent hits the cap, it’s a signal your completion criteria aren’t tight enough.
Doc-phase work should migrate to implementation quickly. The longer you stay in the doc phase, the higher the sprawl risk. Get to code (where termination is natural) as fast as possible.
Completion verification scripts are infrastructure, not overhead. If you’re running agents on a task type repeatedly, build the verifier. It pays for itself in one session.
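
The token cap in the rules above can be sketched as a running counter, assuming your agent tooling reports how many tokens each turn used. How you obtain that number is tool-specific and not shown here. TOKEN_CEILING, tokens_used, and record_tokens are hypothetical names:

```shell
#!/usr/bin/env bash
# Hard token ceiling as a safety net. All names are hypothetical.
TOKEN_CEILING="${TOKEN_CEILING:-100000}"
tokens_used=0

# record_tokens N: add N to the running total and return non-zero
# once the ceiling has been exceeded.
record_tokens() {
  tokens_used=$((tokens_used + $1))
  if [ "$tokens_used" -gt "$TOKEN_CEILING" ]; then
    echo "token ceiling hit: $tokens_used > $TOKEN_CEILING" >&2
    echo "completion criteria are probably not tight enough" >&2
    return 1
  fi
}
```

Inside an agent loop, call record_tokens with the turn's token count and break out when it fails. Treat a trip of the ceiling as a bug report against your verifier, not as a normal exit.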

The Lesson

I burned 50% of a weekly Codex token budget on agent sprawl during doc-phase work. The agents kept revising PRDs and design docs with no termination signal. Code tasks with tests converge naturally. Doc tasks without verifiers never converge.

The fix: either orchestrate doc tasks with a rigid workflow (Mode 1) or build a deterministic completion checker before handing them to a free-form agent (Mode 2). There is no safe third option.

Related

Constraint-First Development – Constraints as specifications
Agent-Driven Development – PRD quality vs. trajectory, guardrails for meta-tasks
Building the Harness – Layered constraint architecture
12 Factor Agents – Micro-agents within deterministic DAGs
Constraint Escalation Ladder – Choosing the right prevention layer