The Two Camps of Agentic Coding


One camp talks to models. The other camp specifies systems. The second camp is where the real leverage lives.

Author: James Phoenix | Date: March 2026


The Divide

There are two competing paradigms in agentic coding right now. Most engineers have picked a side without realising there are sides.

Camp 1: Conversational. Talk to it, iterate, refactor later. Code emerges from dialogue.

Camp 2: Spec-driven. Specify behaviour, generate implementation, verify correctness. Code is a compilation artifact.

The “just talk to it” philosophy has become the dominant voice in agentic engineering circles. Run 3-8 parallel agent instances, prompt with 1-2 sentences plus a screenshot, iterate your way to features. It works. Solo engineers ship 300k+ LOC apps this way.

But there is a structural ceiling to this approach. And it shows up exactly where systems get hard.


Camp 1: Conversational Engineering

The workflow looks like this:

Human intent
    ↓
Prompt (1-2 sentences + screenshot)
    ↓
Agent explores repo
    ↓
Agent writes code
    ↓
Human observes + corrects
    ↓
Refactor later

The hidden assumption: correctness emerges from iteration.

Proponents say it directly. Under-spec your requests, watch the model build, queue changes, morph chaos into shape. Spend 20% of your time on refactoring days. Spec-driven development is “the old way of thinking.”

Where this works

| Condition | Why it works |
|---|---|
| Solo engineer | No coordination overhead |
| < 500k LOC | Agent can hold enough context |
| UI-heavy products | Visual feedback loop is fast |
| Single codebase | No distributed state to reason about |
| Strong intuition | Human pattern-matches faster than specs |
| Fast iteration cycles | Cheap to fix mistakes |

The best conversational engineers are genuinely skilled at this. Intuition for “blast radius,” discipline around atomic commits, knowing when to stop a model mid-run. These are real skills.

But the system has limits.


Where Conversation Breaks Down

Iteration becomes expensive when systems are:

  • Distributed (state lives in multiple places)
  • Stateful (race conditions, failure modes)
  • Safety-critical (correctness is non-negotiable)
  • Multi-agent (coordination between autonomous workers)
  • Financially critical (bugs cost money, not just time)

You cannot iterate your way to correct concurrent state management. You cannot “vibe code” a distributed transaction. You cannot refactor your way out of a race condition you never specified.

The problem is not that conversation is bad. The problem is that conversation does not produce specifications. And without specifications, you cannot reason about:

  • State transitions
  • Failure modes
  • Race conditions
  • Invariants
  • Correctness guarantees

That 20% refactoring time is a tax you pay for skipping specs, and it compounds: as system complexity grows, the refactoring cost grows with it.


Camp 2: Spec-Driven Engineering

The workflow inverts the hierarchy:

Specification (PRD + Design Doc)
    ↓
Constraints (types, tests, invariants)
    ↓
Agent generates implementation
    ↓
Verification (tests pass, invariants hold)
    ↓
Human reviews architecture, not syntax

The core insight: specs + tests are the code. Implementation is a compilation artifact.

This is not a new idea. It comes from formal verification, theorem proving, dependently typed languages. What changed is that LLMs make it practical.

The math

Given:

  • Spec S (behaviour you want)
  • Implementation I (code the agent writes)
  • Correctness: I ⊨ S (implementation satisfies spec)

Tests are partial proofs of this relationship. The more constraints you encode upfront, the smaller the search space for the agent.
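As a sketch of "tests as partial proofs," take a hypothetical spec: merge two lists of IDs into one sorted, duplicate-free list, independent of argument order. The names (`merge_ids`, `check_spec`) are illustrative, not from any real codebase.

```python
# Hypothetical spec: merge two lists of IDs into a sorted,
# duplicate-free list, independent of argument order.

def merge_ids(a: list[int], b: list[int]) -> list[int]:
    """Candidate implementation (the artifact an agent generates)."""
    return sorted(set(a) | set(b))

def check_spec(a: list[int], b: list[int]) -> None:
    out = merge_ids(a, b)
    assert out == sorted(out), "spec: output is sorted"
    assert len(out) == len(set(out)), "spec: no duplicates"
    assert out == merge_ids(b, a), "spec: order-insensitive"

# Passing checks do not prove the implementation satisfies the spec,
# but each encoded constraint shrinks the space of implementations
# that can slip through.
check_spec([1, 3, 5], [2, 3, 4])
print(merge_ids([1, 3, 5], [2, 3, 4]))  # → [1, 2, 3, 4, 5]
```

Each assertion is one clause of the spec made executable; an agent that satisfies all of them has far less room to wander.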

PRD Quality vs Trajectory

Better specs reduce random search. The relationship follows a log curve:

Trajectory Accuracy
        ↑
        │                          day 4+  ← Extremely accurate
        │                     day 3 ·
        │                day 2 ·
        │           ·
        │       ·
        │    ·
        │  ·
        │·  day 1
        │
        └─────────────────────────────────→ Days on PRD + Design Doc
          ↑
     Random walk
| Time on spec | Trajectory alignment |
|---|---|
| 0 (just talk to it) | ~30-40% |
| 1 day | ~60% |
| 3 days | ~80% |
| 5 days | ~90% |

That first day of spec work provides massive alignment gains. Skipping specs entirely means agents wander randomly. This is real computer science, not a model limitation.


Why Camp 1 Rejects Specs

Because models used to be bad at following them.

Old workflow (2024):

spec → model → garbage

So people switched to:

conversation → code

This was rational at the time. Sonnet 3.5 could barely follow a multi-page spec. GPT-4 would hallucinate interfaces that didn’t exist.

But with GPT-5-class and Sonnet 4.x-class agents, we are approaching:

spec → correct code

Which changes the economics entirely.

Most conversational engineers built their intuition in the “models can’t follow specs” era. Their workflow is optimised for that constraint. But the constraint is dissolving.


The Three-Layer Architecture

The future converges on three layers:

┌─────────────────────────────────────────┐
│  Layer 1: Intent                         │
│  PRD, architecture docs, invariants      │
│  (What the system MUST do)               │
├─────────────────────────────────────────┤
│  Layer 2: Constraints                    │
│  Types, tests, contracts, guard rails    │
│  (What correctness LOOKS like)           │
├─────────────────────────────────────────┤
│  Layer 3: Generated Code                 │
│  Implementation artifact                 │
│  (What agents produce)                   │
└─────────────────────────────────────────┘

Agents operate between Layer 2 and Layer 3. Humans own Layer 1 and Layer 2.

This maps directly onto the harness model. Each layer amplifies signal and attenuates noise. The LLM is the least controllable part. Everything else is engineering.
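A minimal sketch of Layer 2, assuming a hypothetical payments domain (the `Transfer` type and its fields are illustrative): the human encodes invariants as types and constructor checks, so any Layer 3 code an agent generates simply cannot represent an invalid state.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Transfer:
    """Layer 2 constraint: a transfer that violates the invariants
    cannot even be constructed by generated Layer 3 code."""
    source: str
    dest: str
    amount_cents: int

    def __post_init__(self) -> None:
        # Invariants from the (hypothetical) Layer 1 spec:
        if self.amount_cents <= 0:
            raise ValueError("amount must be positive")
        if self.source == self.dest:
            raise ValueError("source and destination must differ")

ok = Transfer("acct_a", "acct_b", 2500)  # satisfies the invariants

try:
    Transfer("acct_a", "acct_a", 2500)   # violates an invariant
except ValueError as e:
    print(f"rejected by Layer 2: {e}")
```

The design choice is that correctness lives in the constraint layer, not in whoever happens to remember the rule during review.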


The Signal Processing View

Camp 1 treats the LLM as a collaborator you steer through dialogue.

Camp 2 treats the LLM as a noisy channel you constrain through specifications.

Both are valid mental models. But signal processing tells us something important: you get better output by constraining the channel, not by sending more messages through it.

The conversational approach increases signal through iteration volume. Spec-driven engineering increases signal through constraint density. The second approach scales. The first eventually hits a ceiling.

Camp 1: More iterations → better output (linear)
Camp 2: Better constraints → better output (multiplicative)

It’s Prompts All the Way Down

Everything in agent-driven development reduces to the same primitive: a prompt that produces work.

Vision (prompt)
  → PRDs (prompts that define what)
    → Design Docs (prompts that define how)
      → Tasks (prompts agents execute)
        → Code, tests, docs (output)

A PRD is a prompt. A task is a prompt. A design doc is a prompt. The only difference is scope and audience. Each layer generates the layer below it.
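One way to make the hierarchy concrete, as a sketch only (the field names and example strings are assumptions, not from the article): represent each layer as structured data, where each level narrows the scope of the one below it.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    instruction: str        # the prompt an agent actually executes
    acceptance: list[str]   # checks that gate completion

@dataclass
class DesignDoc:
    approach: str           # the "how" that constrains tasks
    tasks: list[Task] = field(default_factory=list)

@dataclass
class PRD:
    goal: str               # the "what" that constrains designs
    designs: list[DesignDoc] = field(default_factory=list)

# Each layer generates, and constrains, the layer below it.
prd = PRD(goal="Users can export reports as CSV")
doc = DesignDoc(approach="Stream rows to the response; never load the full table")
doc.tasks.append(Task(
    instruction="Add a streaming CSV export endpoint",
    acceptance=["integration test passes", "memory stays flat at 1M rows"],
))
prd.designs.append(doc)
print(prd.designs[0].tasks[0].instruction)
```

Collapsing the hierarchy means handing the agent only `prd.goal` and hoping iteration recovers the rest.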

Conversational engineering collapses this hierarchy. It goes directly from vision to agent execution, relying on iteration to fill the gaps. This is fast. It is also lossy.

Spec-driven engineering preserves the hierarchy. Each layer encodes information that constrains the next. Less information is lost. Fewer iterations are needed.


The Synthesis

The strongest workflow is not pure Camp 1 or Camp 2. It combines both.

Conversation → design exploration (Camp 1)
        ↓
Specification (Camp 2)
        ↓
Agent implementation (Camp 2)
        ↓
Iterative polish (Camp 1)
        ↓
Verification (Camp 2)

Conversation is a spec discovery tool. Not the source of truth.

Use dialogue to explore the problem space. Use specs to encode what you learned. Use agents to compile the spec into code. Use tests to verify correctness. Use iteration to handle the edges.

Some conversational engineers already do a version of this. They ask agents to write specs, send them to a stronger model for review, paste back improvements. But they treat this as optional. In the spec-driven model, it is the core workflow.


What Changes When Specs Are Primary

| Dimension | Conversational | Spec-driven |
|---|---|---|
| Scaling to teams | Hard (intuition doesn’t transfer) | Natural (specs are shareable artifacts) |
| Multi-agent coordination | Manual (separate terminals) | Structured (tasks from specs) |
| Correctness guarantees | Probabilistic (hope + iterate) | Deterministic (verify against spec) |
| Refactoring cost | 20% ongoing tax | Front-loaded, then minimal |
| Knowledge retention | In the engineer’s head | In the spec + test suite |
| Onboarding new agents | Re-explain everything | Read the spec |

The deepest difference: specs survive context window resets. Intuition does not.

When you start a new agent session, a spec gives it full alignment in seconds. Without a spec, the agent has to re-discover your intent from code archaeology. This is why conversational agents read so many files before starting. They are reconstructing the spec that was never written.


The Second Wave

Camp 1 represents the first wave of agentic engineering: “AI writes code.”

The second wave is: “AI proves systems correct.”

The first wave treats agents as fast typists. The second wave treats agents as compilers that take specifications and produce verified implementations.

This is where the real 10x leap happens. Not from faster iteration, but from eliminating entire categories of bugs by construction.

First wave:  implement → test → fix → repeat
Second wave: specify → generate → verify → done

The shift from “correctness through iteration” to “correctness by construction” is the same shift that happened from assembly to high-level languages, from manual memory management to garbage collection, from dynamic typing to static typing.

Each time, engineers resisted. Each time, the constraint-based approach won at scale.


Practical Takeaways

  1. Write the spec first. Even a 30-minute PRD provides massive trajectory alignment gains over “just talk to it.”

  2. Tests are constraints, not afterthoughts. Write them before or alongside implementation. They shrink the search space for agents.

  3. Use conversation for exploration. Dialogue is great for discovering what to build. It is bad for encoding what to build.

  4. Own the task layer. Whoever controls the task layer controls what gets built. Specs feed tasks. Tasks feed agents. Agents produce code.

  5. Iterate on specs, not just code. When an agent produces bad output, the fix is usually a better spec, not a better prompt.

  6. Refactoring is a spec failure. If you need 20% refactoring time, your specs were 20% incomplete. Front-load the thinking.


The Meta

Conversational engineering captures elite-level intuition. The workflow is optimised for speed, solo development, and UI-heavy products. Within those constraints, it is hard to beat.

But constraints change. Systems grow. Teams form. Correctness requirements tighten.

When that happens, the engineers who invested in spec-driven workflows will have compounding leverage. Their specs are reusable. Their tests are cumulative. Their agents get more capable every model generation, and better specs mean agents extract more value from each capability jump.

The conversational engineers will still be iterating. Faster each year, sure. But linearly.

The compounding path is specs.

