12 Factor Agents

James Phoenix

Principles for building production-ready LLM-powered software.

Core Premise

Successful AI agents aren’t pure “loop until solved” systems. They integrate language models into broader deterministic software architectures, using agents for well-scoped tasks within larger workflows. The goal is always customer value.

“Agents get lost when context windows grow too long, spinning endlessly through repeated failures.”

Solution: Embrace micro-agents handling 3-20 step workflows with human-in-the-loop approval gates.

Historical Context

60 years ago: Programs as directed graphs (flowcharts)

20 years ago: DAG orchestrators (Airflow, Prefect, Dagster)

10-15 years ago: DAGs with embedded ML models

Today: Agents as micro-optimized decision points within deterministic workflows

The Agent Loop Problem

Traditional agent loops can spin indefinitely

Solution: Micro-agents within deterministic DAGs

The 12 Factors

Factor 1: Natural Language to Tool Calls

Convert user requests into structured JSON that triggers deterministic code. The LLM decides what to do; your code controls how it executes.

// User: "create a payment link for $750"
// LLM outputs structured tool call:
{
  "tool": "create_payment_link",
  "parameters": {
    "amount": 750,
    "currency": "USD"
  }
}

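// Your code handles execution deterministically
async function executeToolCall(toolCall: ToolCall) {
  switch (toolCall.tool) {
    case "create_payment_link":
      return await stripe.paymentLinks.create({
        line_items: [{
          price_data: {
            currency: toolCall.parameters.currency,
            // Amount is converted to cents; rounding avoids floating-point drift (e.g. 19.99 * 100)
            unit_amount: Math.round(toolCall.parameters.amount * 100),
          },
          quantity: 1,
        }],
      });
    default:
      // Reject tools the model invents instead of silently returning undefined
      throw new Error(`Unknown tool: ${toolCall.tool}`);
  }
}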
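
The model's output is only text until your code accepts it, so it can help to validate the JSON before dispatching it. A minimal sketch, assuming a hypothetical `parseToolCall` helper and the `create_payment_link` shape shown above; the validation rules are illustrative and not part of any SDK:

```typescript
// Hypothetical tool-call shape matching the example above.
interface PaymentLinkCall {
  tool: "create_payment_link";
  parameters: { amount: number; currency: string };
}

// Parses raw LLM output into a typed tool call, or throws with a message
// that could be fed back to the model as an error event.
function parseToolCall(raw: string): PaymentLinkCall {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Tool call is not valid JSON");
  }
  if (typeof data !== "object" || data === null) {
    throw new Error("Tool call must be a JSON object");
  }
  const { tool, parameters } = data as { tool?: unknown; parameters?: unknown };
  if (tool !== "create_payment_link") {
    throw new Error(`Unknown tool: ${String(tool)}`);
  }
  if (typeof parameters !== "object" || parameters === null) {
    throw new Error("Missing parameters object");
  }
  const { amount, currency } = parameters as { amount?: unknown; currency?: unknown };
  if (typeof amount !== "number" || !Number.isFinite(amount) || amount <= 0) {
    throw new Error("amount must be a positive number");
  }
  if (typeof currency !== "string" || !/^[A-Za-z]{3}$/.test(currency)) {
    throw new Error("currency must be a three-letter code");
  }
  return { tool: "create_payment_link", parameters: { amount, currency } };
}
```

Only a value that passes every check reaches the execution code, which keeps the "LLM decides what, code controls how" boundary explicit.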

Factor 2: Own Your Prompts

Resist black-box framework abstractions. Treat prompts as first-class code.

// Bad: Hidden in framework
const agent = new MagicAgent({ task: "deployment" });

// Good: Explicit, testable prompts
const DEPLOYMENT_PROMPT = `
You are a deployment assistant. You have access to the following tools:
- deploy_to_staging: Deploy the current branch to staging
- run_tests: Execute the test suite
- deploy_to_production: Deploy to production (requires approval)

Current context:
- Branch: {{branch}}
- Last commit: {{commit}}
- Test status: {{testStatus}}

Respond with the next action to take.
`;

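function buildPrompt(context: DeploymentContext): string {
  // Function replacers insert values literally, so "$" sequences in a
  // commit message are not interpreted as replacement patterns
  return DEPLOYMENT_PROMPT
    .replace("{{branch}}", () => context.branch)
    .replace("{{commit}}", () => context.lastCommit)
    .replace("{{testStatus}}", () => context.testStatus);
}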
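
Owning the prompt as code means it can be unit-tested like any other function. A minimal sketch, assuming a hypothetical `renderTemplate` helper that fails loudly when a `{{placeholder}}` has no value; the helper name and its error behaviour are illustrative choices:

```typescript
// Replaces every {{key}} placeholder with its value from `vars`.
// Throws if the template references a key that was not supplied, so a
// renamed variable is caught in tests rather than reaching the model.
function renderTemplate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : undefined;
    if (value === undefined) {
      throw new Error(`Missing template variable: ${key}`);
    }
    return value;
  });
}

const rendered = renderTemplate("Branch: {{branch}}\nTests: {{testStatus}}", {
  branch: "main",
  testStatus: "passing",
});
console.log(rendered);
```

A test can then assert on the rendered string directly, with no model call involved.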

Factor 3: Own Your Context Window

Context engineering matters more than tweaking model parameters. Design custom context formats optimized for your domain.

// Structure context for token efficiency and comprehension
function buildContext(events: Event[]): string {
  return `
<system_state>
  <current_step>3 of 5</current_step>
  <status>awaiting_approval</status>
</system_state>

<event_history>
${events.map(e => `  <event type="${e.type}" ts="${e.timestamp}">${e.summary}</event>`).join('\n')}
</event_history>

<available_actions>
  - approve_deployment
  - reject_deployment
  - request_more_info
</available_actions>
`;
}
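
If event summaries can contain user-supplied text, escaping them keeps the surrounding tags unambiguous. A minimal sketch, assuming hypothetical `escapeXml` and `renderEvent` helpers and an event shape that mirrors the fields used above:

```typescript
// Hypothetical event shape, mirroring the fields used in buildContext.
interface ContextEvent {
  type: string;
  timestamp: number;
  summary: string;
}

// Escapes the five characters that are special in XML.
// "&" is handled first so already-produced entities are not double-escaped.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Renders one event line in the same format as the <event_history> block.
function renderEvent(e: ContextEvent): string {
  return `<event type="${escapeXml(e.type)}" ts="${e.timestamp}">${escapeXml(e.summary)}</event>`;
}
```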

Factor 4: Tools Are Just Structured Outputs

Tools are JSON outputs representing the next action. This separation allows flexibility—the same output can trigger different implementations.

// Tool definition - what the LLM sees
const tools = [
  {
    name: "send_notification",
    description: "Send a notification to the user",
    parameters: {
      channel: { type: "string", enum: ["slack", "email", "sms"] },
      message: { type: "string" }
    }
  }
];

// Tool execution - what actually happens
function executeTool(toolCall: ToolCall) {
  // Same tool call, different backends depending on the requested channel
  const { channel, message } = toolCall.parameters;

  switch (channel) {
    case "slack":
      return slackClient.postMessage(message);
    case "email":
      return emailService.send(message);
    case "sms":
      return twilioClient.sendSms(message);
    default:
      // The model can emit values outside the enum; fail loudly instead of returning undefined
      throw new Error(`Unsupported channel: ${channel}`);
  }
}

Factor 5: Unify Execution State and Business State

Derive execution state from context history. A single thread provides serialization, debugging transparency, and easy resumption.

interface AgentThread {
  id: string;
  events: Event[];
  status: "running" | "paused" | "completed" | "failed";
}

// State is derived from events, not stored separately
function deriveState(thread: AgentThread): ExecutionState {
  // A new thread has no events yet, so lastEvent may be undefined
  const lastEvent = thread.events[thread.events.length - 1];

  return {
    currentStep: thread.events.filter(e => e.type === "step_complete").length,
    // An approval is pending until a matching approval_granted event appears
    pendingApprovals: thread.events.filter(e =>
      e.type === "approval_requested" &&
      !thread.events.some(a => a.type === "approval_granted" && a.requestId === e.id)
    ),
    errors: thread.events.filter(e => e.type === "error"),
    canResume: lastEvent?.type !== "completed" && lastEvent?.type !== "failed"
  };
}
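
Because the thread is plain data, persisting and restoring it can be a simple JSON round trip. A minimal sketch with hypothetical `saveThread` and `loadThread` helpers, using an in-memory `Map` as a stand-in for a real database:

```typescript
// Hypothetical shapes; `StoredThread` mirrors the AgentThread interface above.
interface StoredEvent { type: string; [key: string]: unknown }
interface StoredThread {
  id: string;
  events: StoredEvent[];
  status: "running" | "paused" | "completed" | "failed";
}

// In-memory stand-in for a database table keyed by thread id.
const threadStore = new Map<string, string>();

function saveThread(thread: StoredThread): void {
  threadStore.set(thread.id, JSON.stringify(thread));
}

function loadThread(id: string): StoredThread {
  const raw = threadStore.get(id);
  if (raw === undefined) {
    throw new Error(`Thread not found: ${id}`);
  }
  return JSON.parse(raw) as StoredThread;
}

// Usage: the restored thread carries everything needed to derive state again.
saveThread({ id: "t1", events: [{ type: "step_complete" }], status: "paused" });
const restored = loadThread("t1");
```

Nothing beyond the event list and status has to be stored, which is the practical payoff of keeping execution state and business state in one place.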

Factor 6: Launch/Pause/Resume with Simple APIs

Agents need simple interfaces for starting, pausing, and resuming—especially between tool selection and execution.

class Agent {
  async launch(input: string): Promise<AgentThread> {
    const thread = await this.createThread();
    return this.run(thread, input);
  }

  async pause(threadId: string): Promise<void> {
    await this.db.updateThread(threadId, { status: "paused" });
  }

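  async resume(threadId: string, feedback?: string): Promise<AgentThread> {
    const thread = await this.db.getThread(threadId);
    if (feedback) {
      thread.events.push({ type: "human_feedback", content: feedback, timestamp: Date.now() });
    }
    // Mark the thread as running again; a loop gated on status would otherwise exit immediately
    thread.status = "running";
    return this.run(thread);
  }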
}

// Webhook endpoint for external triggers
app.post("/webhook/resume/:threadId", async (req, res) => {
  const { threadId } = req.params;
  const { feedback } = req.body;
  await agent.resume(threadId, feedback);
  res.json({ status: "resumed" });
});

Factor 7: Contact Humans with Tool Calls

Treat human contact as structured tool calls, not plaintext. This enables multi-channel communication and auditable workflows.

const humanTools = [
  {
    name: "request_human_approval",
    description: "Request approval from a human before proceeding",
    parameters: {
      action: { type: "string", description: "What action needs approval" },
      context: { type: "string", description: "Relevant context for decision" },
      urgency: { type: "string", enum: ["low", "medium", "high"] },
      channel: { type: "string", enum: ["slack", "email"] }
    }
  },
  {
    name: "request_human_input",
    description: "Ask a human for information or clarification",
    parameters: {
      question: { type: "string" },
      options: { type: "array", items: { type: "string" }, optional: true }
    }
  }
];

// Execution pauses and waits for human response
async function executeHumanTool(toolCall: ToolCall, thread: AgentThread) {
  await notifyHuman(toolCall);
  thread.status = "paused";
  thread.events.push({
    type: "awaiting_human",
    toolCall,
    timestamp: Date.now()
  });
  return { status: "paused", awaiting: "human_response" };
}
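
The other half of the exchange is recording the human's answer on the same thread. A minimal sketch, assuming a hypothetical `recordHumanResponse` helper and a `human_response` event type; the `approved` and `comment` fields are illustrative:

```typescript
// Hypothetical shapes, reduced to the fields this helper needs.
interface HumanThreadEvent { type: string; timestamp: number; [key: string]: unknown }
interface HumanThread {
  status: "running" | "paused" | "completed" | "failed";
  events: HumanThreadEvent[];
}

// Returns a new thread with the response appended and the status set back
// to "running". Refuses if the thread is not currently waiting on a human.
function recordHumanResponse(
  thread: HumanThread,
  approved: boolean,
  comment: string,
  now: number = Date.now()
): HumanThread {
  const last = thread.events[thread.events.length - 1];
  if (thread.status !== "paused" || last === undefined || last.type !== "awaiting_human") {
    throw new Error("Thread is not awaiting a human response");
  }
  return {
    status: "running",
    events: [
      ...thread.events,
      { type: "human_response", approved, comment, timestamp: now },
    ],
  };
}
```

Returning a new object rather than mutating the input keeps the response auditable: the original paused thread and the resumed one can be compared directly.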

Factor 8: Own Your Control Flow

Build custom loops and branching logic. Different tool types warrant different handling.

async function agentLoop(thread: AgentThread): Promise<AgentThread> {
  while (thread.status === "running") {
    const toolCall = await llm.getNextAction(thread);

    switch (classifyTool(toolCall)) {
      case "immediate": {
        // Data fetching - execute and continue
        const result = await executeTool(toolCall);
        thread.events.push({ type: "tool_result", toolCall, result });
        thread.consecutiveErrors = 0; // A success ends the error streak
        break;
      }

      case "requires_approval":
        // Human approval - pause the loop
        await requestApproval(toolCall);
        thread.status = "paused";
        return thread;

      case "terminal":
        // Completion - end the loop
        thread.status = "completed";
        return thread;

      case "error":
        // Error handling - retry, then escalate on the third consecutive error
        thread.consecutiveErrors++;
        if (thread.consecutiveErrors >= 3) {
          await escalateToHuman(thread);
          thread.status = "paused";
          return thread;
        }
        break;
    }
  }
  return thread;
}
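
`classifyTool` is not defined above. One possible shape is a lookup from tool name to handling category; the tool names and the choice to treat unknown tools as errors are assumptions made for illustration:

```typescript
type ToolClass = "immediate" | "requires_approval" | "terminal" | "error";

// Hypothetical mapping; a real system would derive this from its own tool registry.
const TOOL_CLASSES = new Map<string, ToolClass>([
  ["run_tests", "immediate"],
  ["deploy_to_staging", "immediate"],
  ["deploy_to_production", "requires_approval"],
  ["done", "terminal"],
]);

// Unknown tools fall through to "error" so the loop never executes
// something it has no explicit policy for.
function classifyTool(toolCall: { tool: string }): ToolClass {
  return TOOL_CLASSES.get(toolCall.tool) ?? "error";
}
```

Keeping the policy in a table rather than in the prompt means a tool's handling can be changed without touching the model's instructions.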

Factor 9: Compact Errors into Context Window

Feed error messages back into context for self-healing. Implement thresholds to prevent spin-outs.

async function handleError(error: Error, thread: AgentThread): Promise<void> {
  thread.events.push({
    type: "error",
    message: error.message,
    stack: error.stack?.slice(0, 500), // Truncate for context efficiency
    timestamp: Date.now()
  });

  thread.consecutiveErrors++;

  if (thread.consecutiveErrors >= 3) {
    // Escalate to human rather than spinning
    await requestHumanHelp(thread, {
      reason: "consecutive_errors",
      errors: thread.events.filter(e => e.type === "error").slice(-3)
    });
    thread.status = "paused";
  }
}

// Context includes recent errors for LLM to learn from
function buildErrorContext(thread: AgentThread): string {
  const recentErrors = thread.events
    .filter(e => e.type === "error")
    .slice(-3);

  if (recentErrors.length === 0) return "";

  return `
<recent_errors>
${recentErrors.map(e => `  <error ts="${e.timestamp}">${e.message}</error>`).join('\n')}
</recent_errors>

Note: You have encountered ${recentErrors.length} recent errors.
Please try a different approach or request human assistance.
`;
}
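
When the same failure repeats, the context can carry it once with a count instead of several near-identical entries. A minimal sketch, assuming a hypothetical `compactErrors` helper; grouping by exact message text is a deliberate simplification:

```typescript
// Hypothetical shapes for logged and compacted errors.
interface LoggedError { message: string; timestamp: number }
interface CompactedError { message: string; count: number; lastSeen: number }

// Collapses errors with identical messages into one entry each,
// keeping a count and the most recent timestamp.
function compactErrors(errors: LoggedError[]): CompactedError[] {
  const byMessage = new Map<string, CompactedError>();
  for (const e of errors) {
    const existing = byMessage.get(e.message);
    if (existing) {
      existing.count += 1;
      existing.lastSeen = Math.max(existing.lastSeen, e.timestamp);
    } else {
      byMessage.set(e.message, { message: e.message, count: 1, lastSeen: e.timestamp });
    }
  }
  // Map preserves insertion order, so entries follow first occurrence.
  return Array.from(byMessage.values());
}
```

Three identical timeouts then cost one line of context rather than three, while the count still tells the model the failure is persistent.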

Factor 10: Small, Focused Agents

Scope agents to 3-20 steps maximum. As context grows, LLM performance degrades. This aligns with Liquidation Cadence: ship focused agents that deliver real value rather than sprawling systems that never complete.

// Bad: Monolithic agent
const megaAgent = new Agent({
  capabilities: ["deploy", "test", "monitor", "rollback", "notify", "audit"]
});

// Good: Focused agents composed in a DAG
const deployAgent = new Agent({ capabilities: ["deploy_staging", "deploy_prod"] });
const testAgent = new Agent({ capabilities: ["run_tests", "analyze_results"] });
const notifyAgent = new Agent({ capabilities: ["slack", "email", "pagerduty"] });

// Deterministic orchestration
async function deploymentWorkflow(pr: PullRequest) {
  // Step 1: Deploy to staging (deterministic)
  await deployToStaging(pr);

  // Step 2: Run tests (agent decides which tests)
  const testPlan = await testAgent.planTests(pr);
  const results = await runTests(testPlan);

  // Step 3: Request production approval if tests pass; otherwise alert on failure
  if (results.passed) {
    await deployAgent.requestProdApproval(pr);
  } else {
    await notifyAgent.alertFailure(results);
  }
}

Factor 11: Trigger from Anywhere

Enable agents to launch from events, crons, webhooks, or user actions.

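// Webhook trigger
app.post("/webhook/github", async (req, res) => {
  if (req.body.action === "closed" && req.body.pull_request?.merged) {
    await deployAgent.launch({ pr: req.body.pull_request });
  }
  // Always answer the webhook so the sender does not time out waiting for a response
  res.sendStatus(200);
});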

// Cron trigger
cron.schedule("0 9 * * *", async () => {
  await reportAgent.launch({ type: "daily_summary" });
});

// Slack trigger
slack.command("/deploy", async ({ command, ack }) => {
  await ack();
  await deployAgent.launch({
    branch: command.text,
    requestedBy: command.user_id
  });
});

// Event-driven trigger
eventBus.on("error_spike_detected", async (event) => {
  await incidentAgent.launch({
    alert: event,
    channel: "pagerduty"
  });
});

Factor 12: Make Your Agent a Stateless Reducer

Treat agents as functions transforming input state into output state.

// Agent as a pure function
type AgentReducer = (state: AgentState, event: Event) => AgentState;

const agentReducer: AgentReducer = (state, event) => {
  switch (event.type) {
    case "user_input":
      return { ...state, pendingInput: event.content };

    case "tool_call":
      return { ...state, lastToolCall: event.toolCall };

    case "tool_result":
      return {
        ...state,
        context: [...state.context, event],
        lastToolCall: null,
        consecutiveErrors: 0 // A successful result ends the error streak
      };

    case "error":
      return {
        ...state,
        errors: [...state.errors, event],
        consecutiveErrors: state.consecutiveErrors + 1
      };

    case "human_response":
      return {
        ...state,
        context: [...state.context, event],
        status: "running"
      };

    default:
      return state;
  }
};

// Replay any state by reducing over events
function replayState(events: Event[]): AgentState {
  return events.reduce(agentReducer, initialState);
}
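
Since a reducer has no side effects, replaying the same events always yields the same state, and a run can be resumed from any snapshot. A self-contained sketch with a deliberately reduced state shape; `CounterState`, `ReplayEvent`, and `stepReducer` are hypothetical stand-ins for `AgentState`, `Event`, and `agentReducer`:

```typescript
type ReplayEvent = { type: "tool_result" } | { type: "error"; message: string };
interface CounterState { steps: number; consecutiveErrors: number }

const emptyState: CounterState = { steps: 0, consecutiveErrors: 0 };

// Pure function: the output depends only on the previous state and the event.
function stepReducer(state: CounterState, event: ReplayEvent): CounterState {
  switch (event.type) {
    case "tool_result":
      return { steps: state.steps + 1, consecutiveErrors: 0 };
    case "error":
      return { ...state, consecutiveErrors: state.consecutiveErrors + 1 };
  }
}

const eventLog: ReplayEvent[] = [
  { type: "tool_result" },
  { type: "error", message: "timeout" },
  { type: "error", message: "timeout" },
];

// Replaying the whole log from scratch...
const full = eventLog.reduce(stepReducer, emptyState);

// ...matches resuming from a snapshot taken after the first event.
const snapshot = eventLog.slice(0, 1).reduce(stepReducer, emptyState);
const resumed = eventLog.slice(1).reduce(stepReducer, snapshot);
```

Both `full` and `resumed` end with one completed step and two consecutive errors, which is the property that makes pausing, resuming, and debugging from stored events dependable.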

DeployBot Example

Key Takeaway

“I don’t know what’s the best approach, but you want flexibility to try everything.”

Own your stack to experiment freely with prompts, context structures, control flow, and error handling strategies.

Related

Context-Efficient Backpressure – Factor 9: Compact Errors
Writing a Good CLAUDE.md – Factor 2: Own Your Prompts
FP Increases LLM Signal – Factor 4: Tools Are Just Structured Outputs
Agent Capabilities – Expanding agent tool access
Systems Thinking

Business Context

These architectural principles serve a higher purpose: delivering real value to customers. Well-designed agent systems should optimize for measurable outcomes, not technical elegance for its own sake.

Value Creation – Agent systems must optimize for customer value
Liquidation Cadence – Ship agents that deliver real value
