Agent Sprawl and the Two Constraint Modes

James Phoenix

Every agent task needs a termination condition. If the task has ambiguous completion criteria, the agent will sprawl until you run out of tokens.

Author: James Phoenix | Date: March 2026


The Two Modes

There are two fundamentally different ways to constrain coding agents:

Mode 1: Orchestrated Workflows (Programmatic)

Custom prompts, skills, MCP servers, or agent framework code that dictate the workflow step by step.

Prompt A → Tool Call → Prompt B → Verification → Done

Strengths:

  • Predictable token spend per task
  • Completion is built into the workflow (the last step is the last step)
  • Easy to cap iterations and enforce exit conditions
  • Best for tasks where you know the shape of the answer

Weaknesses:

  • Rigid. The agent can’t explore outside the programmed path
  • Front-loads engineering effort into workflow design
  • Breaks when the task doesn’t fit the template
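
A minimal sketch of the Mode 1 shape, using the step names from the flow above. `run_step` is a hypothetical stand-in for however you invoke your agent or tool; here it only echoes. The step list itself is the termination condition: when the loop runs out of steps, the task is over.

```shell
# Hypothetical placeholder: replace the body with your real agent or tool call.
run_step() {
  echo "running: $1"
}

# Fixed, ordered workflow. There is no "keep going" branch:
# the workflow ends when the last step ends, or when a step fails.
run_workflow() {
  for step in prompt-a tool-call prompt-b verification; do
    run_step "$step" || { echo "failed at: $step" >&2; return 1; }
  done
  echo "done"
}
```

Calling `run_workflow` runs the four steps once and stops. Token spend is bounded by the number of steps, which is the trade you make for losing exploration.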

Mode 2: Free-Form + Deterministic Checks (Search-Based)

Let the agent create freely, then prune with linters, tests, parsers, and verification scripts.

Agent explores freely → Linters/tests reject bad output → Loop until checks pass

Strengths:

  • Agents can explore, discover, and solve novel problems
  • No need to anticipate every step in advance
  • Better for exploratory, creative, or research tasks

Weaknesses:

  • Checks must be watertight, otherwise the search space is unbounded
  • Without a completion signal, the agent loops until the token budget runs out
  • Token spend is unpredictable and can explode
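
A rough sketch of the Mode 2 loop with an iteration cap bolted on. The function name and its arguments are illustrative, not from any framework: you pass in the name of a command that makes one agent attempt and the name of a command that runs your deterministic checks.

```shell
# search_until_green MAX_ITERATIONS ATTEMPT_CMD CHECK_CMD
# ATTEMPT_CMD and CHECK_CMD are names of commands or functions you supply (assumptions).
search_until_green() {
  max_iterations=$1
  attempt_cmd=$2
  check_cmd=$3
  i=1
  while [ "$i" -le "$max_iterations" ]; do
    $attempt_cmd "$i"            # agent explores freely
    if $check_cmd; then          # linters/tests decide whether we are done
      echo "checks passed after $i iteration(s)"
      return 0
    fi
    i=$((i + 1))
  done
  echo "gave up after $max_iterations iteration(s)" >&2
  return 1
}
```

The cap does not make the checks any tighter. It only turns an unbounded search into a bounded one, so a weak check fails loudly instead of silently draining tokens.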

The Danger: Agent Sprawl

Agent sprawl is what happens when an agent runs on a task with no clear termination condition. It keeps iterating, generating, revising, and burning tokens without converging on “done.”

Root cause: The task has ambiguous completion criteria.

Where it hits hardest: Tasks where “good enough” is subjective.

Task Type                | Completion Signal        | Sprawl Risk
Fix failing test         | Test passes              | Low
Implement API endpoint   | Tests pass + types check | Low
Write PRD / design doc   | ???                      | High
Refactor for readability | ???                      | High
Write documentation      | ???                      | High
Explore solution space   | ???                      | High

Code tasks have natural termination: tests pass, types check, linter is green. Document tasks have no equivalent built-in gate.


The Completion Verification Gap

In Mode 1 (orchestrated), completion is easy. The workflow has a final step. You programmed when to stop.

In Mode 2 (free-form), you must build a completion verifier for every task type. Without one, you’re running an unbounded loop.

For code:

# Natural completion verifier
npm test && npm run typecheck && npm run lint
# All green = done

For docs, you need to invent one:

# Example: PRD completion verifier
./scripts/verify-prd.sh docs/prd.md

# Checks:
# - All required sections present (problem, solution, scope, metrics)
# - Word count within bounds (not too thin, not bloated)
# - No TODO/TBD placeholders remaining
# - Links to design doc exist
# - Acceptance criteria are testable (contain measurable assertions)
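
One way the checks listed above could look in practice. This is a sketch, not the actual `verify-prd.sh`: the heading pattern, the 300 to 3000 word bounds, the "link target contains design" rule, and the "every acceptance bullet contains a number" heuristic are all assumptions you would tune for your own PRD template.

```shell
# Sketch of a PRD verifier. Every threshold and pattern below is an assumption.
MIN_WORDS=${MIN_WORDS:-300}
MAX_WORDS=${MAX_WORDS:-3000}

# verify_prd FILE: prints each failed check, returns 0 only if none failed.
verify_prd() {
  prd=$1
  failures=0
  fail() { echo "FAIL: $1"; failures=$((failures + 1)); }

  [ -f "$prd" ] || { echo "FAIL: no such file: $prd"; return 1; }

  # 1. Required sections, matched as lines that start with the section name.
  for section in problem solution scope metrics; do
    grep -qi "^#* *$section" "$prd" || fail "missing section: $section"
  done

  # 2. Word count within bounds ($(( )) strips any padding wc adds).
  words=$(($(wc -w < "$prd")))
  if [ "$words" -lt "$MIN_WORDS" ] || [ "$words" -gt "$MAX_WORDS" ]; then
    fail "word count $words outside $MIN_WORDS-$MAX_WORDS"
  fi

  # 3. No TODO/TBD placeholders remaining.
  if grep -qwE 'TODO|TBD' "$prd"; then fail "placeholder text remains"; fi

  # 4. At least one Markdown link whose target mentions "design".
  grep -qiE '\]\([^)]*design[^)]*\)' "$prd" || fail "no link to a design doc"

  # 5. Crude proxy for "testable": every bullet under an
  #    "Acceptance criteria" heading must contain a number.
  awk 'tolower($0) ~ /^#* *acceptance criteria/ { in_ac = 1; next }
       /^#/ { in_ac = 0 }
       in_ac && /^[-*] / { total++; if ($0 ~ /[0-9]/) measurable++ }
       END { exit !(total > 0 && total == measurable) }' "$prd" \
    || fail "acceptance criteria missing or not measurable"

  [ "$failures" -eq 0 ]
}
```

Usage would be `verify_prd docs/prd.md` as the loop's exit condition. The fifth check is the weakest: a number in a bullet does not prove the criterion is testable, it only rejects the obviously vague ones.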

If you can’t write a verification script for a task, that task should not be given to a free-form agent. Use Mode 1 or do it yourself.


The Decision Framework

Are the completion criteria deterministic?
  │
  ├── YES → Mode 2 is safe (free-form + checks)
  │         Examples: code, migrations, config changes
  │
  └── NO → Does the task have a known shape?
            │
            ├── YES → Mode 1 (orchestrated workflow)
            │         Examples: PRD from template, changelog from commits
            │
            └── NO → Human-in-the-loop required
                      Examples: architecture decisions, novel design
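
The tree above is small enough to encode as a pre-flight check before dispatching a task. A sketch, where the function name and the output labels are made up for illustration:

```shell
# choose_mode DETERMINISTIC KNOWN_SHAPE
# Both arguments are "yes" or "no"; the labels printed are illustrative only.
choose_mode() {
  if [ "$1" = "yes" ]; then
    echo "mode-2-free-form"        # deterministic completion criteria
  elif [ "$2" = "yes" ]; then
    echo "mode-1-orchestrated"     # known shape, no deterministic check
  else
    echo "human-in-the-loop"       # neither: do not hand it to an agent alone
  fi
}
```

For example, `choose_mode no yes` prints `mode-1-orchestrated`, which is the PRD-from-template case.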

Practical Rules

  1. Never run free-form agents on doc-phase tasks without a completion verifier. PRDs, design docs, and documentation have no natural stopping point. The agent will revise endlessly.

  2. If using Mode 2, your linters must be watertight. Every gap in your checks is a dimension the agent can sprawl into. Partial linting is worse than no linting because it gives false confidence.

  3. Cap token budgets as a safety net. Even with good checks, set a hard ceiling. If the agent hits the cap, it’s a signal your completion criteria aren’t tight enough.

  4. Doc-phase work should migrate to implementation quickly. The longer you stay in the doc phase, the higher the sprawl risk. Get to code (where termination is natural) as fast as possible.

  5. Completion verification scripts are infrastructure, not overhead. If you’re running agents on a task type repeatedly, build the verifier. It pays for itself in one session.
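
Rule 3 can be sketched as a wrapper around the agent loop. This assumes your attempt command prints the number of tokens it used on stdout; how you extract that figure depends on your tool, so treat `ATTEMPT_CMD` and `CHECK_CMD` as hypothetical hooks you supply.

```shell
# run_with_budget BUDGET ATTEMPT_CMD CHECK_CMD
# Assumption: ATTEMPT_CMD prints the token count for one attempt on stdout.
run_with_budget() {
  budget=$1
  attempt_cmd=$2
  check_cmd=$3
  spent=0
  # The ceiling is checked between attempts, so the last attempt can overshoot it.
  while [ "$spent" -lt "$budget" ]; do
    used=$($attempt_cmd)
    spent=$((spent + used))
    if $check_cmd; then
      echo "done: spent $spent of $budget tokens"
      return 0
    fi
  done
  echo "budget exhausted: spent $spent of $budget tokens" >&2
  return 2   # distinct exit code: treat it as "completion criteria too loose"
}
```

Exit code 2 is worth logging separately from ordinary failures. A task type that keeps hitting the ceiling is telling you its verifier has gaps.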

The Lesson

I burned 50% of a weekly Codex token budget on agent sprawl during doc-phase work. The agents kept revising PRDs and design docs with no termination signal. Code tasks with tests converge naturally. Doc tasks without verifiers never converge.

The fix: either orchestrate doc tasks with a rigid workflow (Mode 1) or build a deterministic completion checker before handing them to a free-form agent (Mode 2). There is no safe third option.

