Agent-Driven Development

James Phoenix

A workflow where AI agents execute development tasks from structured specs, with humans controlling the task layer rather than writing code directly.

The Three Pillars

You must control:

  1. Tasks – The work queue (what gets done)
  2. Orchestration – The loop (how work flows)
  3. Memory – Context persistence (optional but powerful)

Control the task layer and you can endlessly add work. Spin up 1-3 workers grinding tasks while you focus on specs and review.

The Core Loop

PRD + Design Doc → Worker → Tests
  1. Write PRD (what)
  2. Write design doc (how, interfaces, testing strategy)
  3. Worker generates implementation
  4. Tests validate correctness
  5. Human reviews and iterates

This is “best iteration guess” mode: you front-load the thinking into specs, and the agent does the grind.
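The loop above can be sketched as plain control flow. A minimal sketch, where `runWorker` and `runTests` are hypothetical stand-ins for your agent runner and test harness, not a real API:

```typescript
// Sketch of the spec → worker → tests → review loop.
type Spec = { prd: string; designDoc: string };
type Result = { code: string; testsPassed: boolean };

function runWorker(spec: Spec, feedback: string[]): string {
  // An agent would generate an implementation from the spec here;
  // review feedback from earlier iterations rides along in the prompt.
  const revision = feedback.length > 0 ? ` (rev ${feedback.length})` : "";
  return `// implementation for: ${spec.prd}${revision}`;
}

function runTests(code: string): boolean {
  // Toy condition standing in for a real test harness:
  // the first draft fails, the first revision passes.
  return code.includes("rev");
}

function coreLoop(spec: Spec, maxIterations = 3): Result {
  const feedback: string[] = [];
  let code = "";
  for (let i = 0; i < maxIterations; i++) {
    code = runWorker(spec, feedback);                       // worker generates
    if (runTests(code)) return { code, testsPassed: true }; // tests validate
    feedback.push("tests failed, revise");                  // review iterates
  }
  return { code, testsPassed: false };
}
```

The point of the sketch: the human never writes `code`, only `Spec` and the review feedback.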

Two-Phase Workflow

Phase 1: Spec-Driven Implementation

PRD + Design Doc → Worker → Tests → Review

Get the first working iteration from structured specs.

Phase 2: Agent Swarms for Polish

Agent Swarm on services/<service> → Bug Detection → Fixes

Later, run agent swarms on globs to find bugs, inconsistencies, and cleanup opportunities. Target specific directories:

  • services/<service-name>
  • src/components/
  • lib/

Swarms do the tedious polish work humans skip.
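Fanning a swarm out over globs can be as simple as enqueuing one review task per target directory. A sketch with illustrative paths and a hypothetical `fanOut` helper:

```typescript
// Sketch: one polish task per target directory. Paths are illustrative.
const targets = ["services/billing", "src/components/", "lib/"];

type PolishTask = { dir: string; prompt: string };

function fanOut(dirs: string[]): PolishTask[] {
  // Each task becomes the prompt for one swarm agent.
  return dirs.map((dir) => ({
    dir,
    prompt: `Scan ${dir} for bugs, inconsistencies, and cleanup opportunities; file one fix task per finding.`,
  }));
}
```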

PRD Quality vs Trajectory

Better specs reduce random search. The relationship follows a log curve:

Trajectory Accuracy
        ↑
        │                          day 4+  ← Extremely accurate trajectory
        │                     day 3 ·
        │                day 2 ·
        │           ·
        │       ·
        │    ·
        │  ·
        │·  day 1
        │
        └─────────────────────────────────→ Days spent on PRD + Design Doc
          ↑
     Not accurate
      trajectory

| Time Investment | Trajectory Alignment |
| --- | --- |
| 1 day | ~60% |
| 3 days | ~80% |
| 5 days | ~90% |

Key insight: The first day of spec work provides massive alignment gains. After that, diminishing returns. But skipping specs entirely means agents wander randomly.


Same principle applies to humans: better specs = less wasted iteration. This is real computer science, not an agent limitation.

PRD vs Design Doc Split

PRD (What):

  • Feature requirements
  • User stories
  • Architectural constraints (“Redis for X because of latency, Postgres for Y because of ACID”)
  • Success criteria

Design Doc (How):

  • Interfaces and types
  • Implementation approach
  • Testing strategy
  • File structure

Link everything in an index.md. Your task system can then create tasks directly from the docs.
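A minimal `index.md` following this pattern might look like the following; the file names are illustrative:

```markdown
# Project Index

## PRDs (what)
- [Billing PRD](prds/billing.md)
- [Notifications PRD](prds/notifications.md)

## Design Docs (how)
- [Billing Design Doc](design/billing.md)
- [Notifications Design Doc](design/notifications.md)
```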

Why You Must Own the Task Layer

If the entire stack is prompts all the way down, then whoever controls the task layer controls what gets built. The orchestration flow encodes your domain knowledge. If a tool dictates the flow, it’s not a tool. It’s a competitor.

This is why frameworks are the wrong abstraction. You need primitives.

Headless vs Framework

Framework approach: Opinionated flow, batteries included, less flexibility. The framework owns your orchestration. You’re renting.

Headless approach (recommended): Just provide tasks + memory primitives (like TanStack does for state). You own your orchestration. You’re building.

Headless wins because:

  • Every codebase has different needs
  • Orchestration patterns vary by domain
  • You want control, not magic
  • The orchestration IS your product. You can’t outsource it.

tx: The Control Plane

tx is the concrete implementation of this philosophy. Primitives for AI agents, not a framework. Headless infrastructure for memory, tasks, and orchestration.

Core primitives:

| Primitive | What it does |
| --- | --- |
| `tx ready` | Get next workable task |
| `tx claim` | Lease-based claim so parallel agents don’t collide |
| `tx done` | Complete a task |
| `tx block` | Declare dependencies between tasks |
| `tx handoff` | Transfer work with context |
| `tx context` | Retrieve relevant learnings/memory |

These are the minimal building blocks. You compose them into whatever orchestration your domain needs. No opinions on flow, no magic, just primitives you wire together.
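Composed, the primitives become a worker loop. A self-contained sketch where in-memory functions stand in for the `tx ready` / `tx claim` / `tx done` CLI calls (the real commands would be shelled out to):

```typescript
// Sketch of a worker loop over the tx primitives. An in-memory queue
// stands in for the real task store so the loop is self-contained.
type Task = { id: string; claimedBy?: string };

const queue: Task[] = [{ id: "T1" }, { id: "T2" }];

function txReady(): Task | undefined {
  // tx ready: next workable, unclaimed task
  return queue.find((t) => !t.claimedBy);
}

function txClaim(task: Task, worker: string): boolean {
  // tx claim: lease-based claim so parallel agents don't collide
  if (task.claimedBy) return false;
  task.claimedBy = worker;
  return true;
}

function txDone(task: Task): void {
  // tx done: complete the task and remove it from the queue
  queue.splice(queue.indexOf(task), 1);
}

function workerLoop(worker: string): string[] {
  const completed: string[] = [];
  for (let task = txReady(); task; task = txReady()) {
    if (!txClaim(task, worker)) continue; // another agent got it first
    // ... agent does the actual work here ...
    txDone(task);
    completed.push(task.id);
  }
  return completed;
}
```

The orchestration (retry, handoff, blocking) lives in your loop, not in the primitives; that is the headless bet.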

tx-agent-kit is the full-stack starter (Effect, Temporal, Next.js, Drizzle) that shows one way to compose these primitives into a working system.

Getting Started

Burn $100-200/month on a project just to get used to having a worker constantly running. This builds intuition for:

  • How to structure specs for agents
  • What tasks to automate vs. do manually
  • How to review agent output efficiently
  • When to intervene vs. let it run

This is the mainstream future. The skill is learning to direct agents, not compete with them.

It’s Prompts All the Way Down

Everything in agent-driven development reduces to the same primitive: a prompt that produces work. The entire stack is just prompts invoking prompts.

Vision (prompt)
  → PRDs (prompts that define what to build)
    → Design Docs (prompts that define how)
      → Tasks (prompts that agents execute)
        → Sub-tasks (prompts spawned by tasks)
          → Code, tests, docs (output)

There is no magic layer. A PRD is a prompt. A task is a prompt. A design doc is a prompt. An agent’s system message is a prompt. The only difference is scope and audience. Each layer is a prompt that generates the layer below it.

The Task Layer is Self-Recursive

The task layer is special because it can feed itself. A task is just a prompt, and a prompt can say “generate more prompts.”

meta-task (prompt) → 10-50 concrete tasks (prompts)
  ├── task A (prompt → code)
  ├── task B (prompt → code)
  ├── ...
  └── task N (prompt → more prompts → more tasks)

This is not a hack. It’s the natural consequence of the stack being prompts all the way down. Any layer can generate any other layer.

Why This Works

When agents run on a cron or loop, task completion can outpace task creation. The queue drains and the system stalls.

Meta-tasks fix this. They are tasks whose only job is to generate more tasks. The queue feeds itself. “Grow outcomes from outcomes.”

Examples:

  • “Read these 10 PRDs and create implementation tasks for each”
  • “Audit the codebase for missing tests and create a task per gap”
  • “Break this epic into 20 sub-tasks with acceptance criteria”
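A meta-task is just a handler that pushes tasks instead of producing code. A toy sketch of a self-feeding queue (the queue, handler, and PRD names are illustrative):

```typescript
// Toy sketch of a self-feeding queue: a meta-task's handler generates
// concrete tasks instead of producing code.
type QueueTask = { id: string; kind: "meta" | "concrete"; payload: string };

const taskQueue: QueueTask[] = [
  { id: "M1", kind: "meta", payload: "create implementation tasks for 3 PRDs" },
];

function handleMeta(task: QueueTask): QueueTask[] {
  // An agent would read the PRDs and emit one task per PRD here.
  return ["billing", "auth", "search"].map((prd, i) => ({
    id: `${task.id}.${i + 1}`,
    kind: "concrete",
    payload: `implement ${prd} per its PRD`,
  }));
}

function drain(): string[] {
  const executed: string[] = [];
  while (taskQueue.length > 0) {
    const task = taskQueue.shift();
    if (!task) break; // never hit; satisfies the type checker
    if (task.kind === "meta") {
      taskQueue.push(...handleMeta(task)); // the queue feeds itself
    }
    executed.push(task.id);
  }
  return executed;
}
```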

When to Use

  • Cron-based agent loops where completion rate > creation rate
  • Bootstrapping a new project (e.g. “generate tasks for all 10 PRDs”)
  • Expanding scope without manual intervention (side projects, experiments)

Guardrails

| Risk | Mitigation |
| --- | --- |
| Runaway expansion | Cap meta-task depth or ratio (e.g. 1 in 10 generated tasks is a meta-task) |
| Quality decay | Tasks drift from intent the further they are from the root; anchor meta-tasks to PRDs/specs |
| Cron bottleneck | If an hourly cron still can’t keep up, increase frequency or batch size |
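The runaway-expansion guardrail can be as simple as a depth check before spawning. A sketch, assuming depth is tracked on each meta-task:

```typescript
// Sketch of the runaway-expansion guardrail: refuse to spawn a
// meta-task past a maximum depth.
type MetaTask = { id: string; depth: number };

const MAX_META_DEPTH = 2; // meta-tasks may spawn meta-tasks at most twice

function spawnMeta(parent: MetaTask): MetaTask | null {
  // Past the cap, the parent may only generate concrete tasks.
  if (parent.depth >= MAX_META_DEPTH) return null;
  return { id: `${parent.id}.meta`, depth: parent.depth + 1 };
}
```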

Best suited for side projects and experiments where full human-in-the-loop isn’t needed. For production systems, keep the human in the loop on task creation.

The Meta

  1. It’s prompts all the way down. PRDs, design docs, tasks, agent configs. Same primitive, different scope.
  2. Control the task layer. That’s where leverage lives.
  3. The task layer is self-recursive. Tasks can create tasks. Prompts can create prompts.
  4. Structure specs as linked PRDs + design docs (see the index pattern).
  5. Workers grind the implementation.
  6. Agent swarms clean up.
  7. Humans set vision, review, and handle edge cases.

You become the architect. Agents become the workforce. The task queue is the machine.

