Agent Sprawl and the Two Constraint Modes


Every agent task needs a termination condition. If the task has ambiguous completion criteria, the agent will sprawl until you run out of tokens.

Author: James Phoenix | Date: March 2026


The Two Modes

There are two fundamentally different ways to constrain coding agents:

Mode 1: Orchestrated Workflows (Programmatic)

Custom prompts, skills, MCP servers, or agent framework code that dictate the workflow step by step.

Prompt A → Tool Call → Prompt B → Verification → Done

Strengths:

  • Predictable token spend per task
  • Completion is built into the workflow (the last step is the last step)
  • Easy to cap iterations, enforce exit conditions
  • Best for tasks where you know the shape of the answer

Weaknesses:

  • Rigid. The agent can’t explore outside the programmed path
  • Front-loads engineering effort into workflow design
  • Breaks when the task doesn’t fit the template
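The Prompt A → Tool Call → Prompt B → Verification → Done flow can be sketched as a fixed pipeline. Here `run_prompt` and `verify_output` are hypothetical stand-ins for your actual agent invocation and check commands, not a real CLI:

```shell
# Orchestrated workflow sketch: every step is explicit, so completion is
# built in (the last step is the last step).
# run_prompt and verify_output are hypothetical placeholders.
run_prompt()    { echo "output for: $1"; }
verify_output() { [ -n "$1" ]; }

draft=$(run_prompt "Prompt A: draft the change")    # Prompt A
revised=$(run_prompt "Prompt B: revise: ${draft}")  # Tool call + Prompt B
verify_output "${revised}"                          # Verification
status=done                                         # Done: nothing runs after this
echo "${status}"
```

The point of the sketch is structural: the script cannot sprawl, because there is no step after the last one.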

Mode 2: Free-Form + Deterministic Checks (Search-Based)

Let the agent create freely, then prune with linters, tests, parsers, and verification scripts.

Agent explores freely → Linters/tests reject bad output → Loop until checks pass

Strengths:

  • Agents can explore, discover, and solve novel problems
  • No need to anticipate every step in advance
  • Better for exploratory, creative, or research tasks

Weaknesses:

  • Checks must be watertight; otherwise the search space is unbounded
  • Without a completion signal, the agent loops forever
  • Token spend is unpredictable and can explode

The Danger: Agent Sprawl

Agent sprawl is what happens when an agent runs on a task with no clear termination condition. It keeps iterating, generating, revising, and burning tokens without converging on “done.”

Root cause: The task has ambiguous completion criteria.

Where it hits hardest: Tasks where “good enough” is subjective.

Task Type                  Completion Signal            Sprawl Risk
Fix failing test           Test passes                  Low
Implement API endpoint     Tests pass + types check     Low
Write PRD / design doc     ???                          High
Refactor for readability   ???                          High
Write documentation        ???                          High
Explore solution space     ???                          High

Code tasks have natural termination: tests pass, types check, linter is green. Document tasks have no equivalent built-in gate.


The Completion Verification Gap

In Mode 1 (orchestrated), completion is easy. The workflow has a final step. You programmed when to stop.

In Mode 2 (free-form), you must build a completion verifier for every task type. Without one, you’re running an unbounded loop.

For code:

# Natural completion verifier
npm test && npm run typecheck && npm run lint
# All green = done

For docs, you need to invent one:

# Example: PRD completion verifier
./scripts/verify-prd.sh docs/prd.md

# Checks:
# - All required sections present (problem, solution, scope, metrics)
# - Word count within bounds (not too thin, not bloated)
# - No TODO/TBD placeholders remaining
# - Links to design doc exist
# - Acceptance criteria are testable (contain measurable assertions)
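A minimal version of those checks might look like the following, written as a shell function for illustration. The section names, word-count bounds, and `design-doc` link pattern are assumptions, not a prescribed PRD format:

```shell
# Sketch of verify-prd.sh as a function. All thresholds and patterns
# below are illustrative assumptions; tune them to your own doc format.
verify_prd() {
  local prd="$1" words
  # All required sections present
  for section in Problem Solution Scope Metrics; do
    grep -qi "^#* *${section}" "$prd" || { echo "missing section: ${section}"; return 1; }
  done
  # Word count within bounds (not too thin, not bloated)
  words=$(wc -w < "$prd")
  [ "$words" -ge 300 ] && [ "$words" -le 3000 ] || { echo "word count out of bounds: ${words}"; return 1; }
  # No TODO/TBD placeholders remaining
  if grep -qE 'TODO|TBD' "$prd"; then echo "unresolved placeholders"; return 1; fi
  # Links to design doc exist
  grep -q 'design-doc' "$prd" || { echo "no design doc link"; return 1; }
  echo "PRD checks passed"
}
```

Each check is cheap, deterministic, and gives the agent a concrete failure message to iterate against, which is exactly what "done" needs to mean in Mode 2.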

If you can’t write a verification script for a task, that task should not be given to a free-form agent. Use Mode 1 or do it yourself.


The Decision Framework

Are the completion criteria deterministic?
  │
  ├── YES → Mode 2 is safe (free-form + checks)
  │         Examples: code, migrations, config changes
  │
  └── NO → Does the task have a known shape?
            │
            ├── YES → Mode 1 (orchestrated workflow)
            │         Examples: PRD from template, changelog from commits
            │
            └── NO → Human-in-the-loop required
                      Examples: architecture decisions, novel design
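The tree above is small enough to encode directly. A sketch, with the two questions as yes/no inputs (the function name and labels are illustrative):

```shell
# The decision framework as a function: two yes/no answers map onto the
# three branches above. choose_mode is an illustrative name.
choose_mode() {
  local deterministic_completion="$1" known_shape="$2"
  if [ "$deterministic_completion" = "yes" ]; then
    echo "Mode 2: free-form + deterministic checks"
  elif [ "$known_shape" = "yes" ]; then
    echo "Mode 1: orchestrated workflow"
  else
    echo "human-in-the-loop"
  fi
}
```

For example, `choose_mode no yes` (non-deterministic completion, known shape) lands on Mode 1, which is where PRD-from-template work belongs.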

Practical Rules

  1. Never run free-form agents on doc-phase tasks without a completion verifier. PRDs, design docs, and documentation have no natural stopping point. The agent will revise endlessly.

  2. If using Mode 2, your linters must be watertight. Every gap in your checks is a dimension the agent can sprawl into. Partial linting is worse than no linting because it gives false confidence.

  3. Cap token budgets as a safety net. Even with good checks, set a hard ceiling. If the agent hits the cap, it’s a signal your completion criteria aren’t tight enough.

  4. Doc-phase work should migrate to implementation quickly. The longer you stay in the doc phase, the higher the sprawl risk. Get to code (where termination is natural) as fast as possible.

  5. Completion verification scripts are infrastructure, not overhead. If you’re running agents on a task type repeatedly, build the verifier. It pays for itself in one session.
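Rule 3's hard ceiling can be wired around the Mode 2 loop directly. In this sketch, `agent_step` and `checks_pass` are hypothetical placeholders for your agent invocation and verifier; the fake verifier passes on the third iteration so the loop converges:

```shell
# Safety net: a hard iteration cap around the free-form check loop.
# agent_step and checks_pass are hypothetical placeholders.
MAX_ITERATIONS=10
i=0
converged=""
agent_step()  { i=$((i + 1)); }    # stand-in for one agent iteration
checks_pass() { [ "$i" -ge 3 ]; }  # stand-in for linters/tests/verifier

while [ "$i" -lt "$MAX_ITERATIONS" ]; do
  agent_step
  if checks_pass; then converged=yes; break; fi
done

if [ -n "$converged" ]; then
  echo "converged in $i iterations"
else
  # Hitting the cap is a signal that the completion criteria are too loose.
  echo "budget exhausted without converging" >&2
fi
```

In practice the cap would be a token budget rather than an iteration count, but the shape is the same: the loop must have two exits, checks passing or budget exhausted, never just one.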


The Lesson

I burned 50% of a weekly Codex token budget on agent sprawl during doc-phase work. The agents kept revising PRDs and design docs with no termination signal. Code tasks with tests converge naturally. Doc tasks without verifiers never converge.

The fix: either orchestrate doc tasks with a rigid workflow (Mode 1) or build a deterministic completion checker before handing them to a free-form agent (Mode 2). There is no safe third option.

