12 Factor Agents

James Phoenix

Principles for building production-ready LLM-powered software.


Core Premise

Successful AI agents aren’t pure “loop until solved” systems. They integrate language models into broader deterministic software architectures, using agents for well-scoped tasks within larger workflows. The goal is always customer value.

“Agents get lost when context windows grow too long, spinning endlessly through repeated failures.”

Solution: Embrace micro-agents handling 5-20 step workflows with human-in-the-loop approval gates.


Historical Context

60 years ago: Programs as directed graphs (DAGs)
20 years ago: DAG orchestrators (Airflow, Prefect, Dagster)
10-15 years ago: DAGs with embedded ML models
Today: Agents as micro-optimized decision points within deterministic workflows

The Agent Loop Problem

Traditional agent loops can spin indefinitely
Solution: Micro-agents within deterministic DAGs
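
One way to picture the fix is a hard step budget around the loop. The sketch below is a minimal illustration, not the article's implementation: `runMicroAgent`, `Action`, `Outcome`, and `maxSteps` are hypothetical names, and `decide` stands in for the LLM call (kept synchronous here for brevity; a real call would be async).

```typescript
// Hypothetical action shape: either call a tool or finish.
type Action =
  | { kind: "tool"; name: string }
  | { kind: "done"; answer: string };

type Outcome =
  | { status: "completed"; answer: string; steps: number }
  | { status: "escalated"; steps: number };

// `decide` stands in for the LLM; it sees the tool names called so far.
function runMicroAgent(
  decide: (history: string[]) => Action,
  maxSteps: number
): Outcome {
  const history: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = decide(history);
    if (action.kind === "done") {
      return { status: "completed", answer: action.answer, steps: history.length };
    }
    history.push(action.name); // a real system would execute the tool here
  }
  // Budget exhausted: hand control back to the surrounding workflow or a human.
  return { status: "escalated", steps: history.length };
}

// An agent that never finishes is stopped after 5 steps instead of looping forever.
const stuck = runMicroAgent(() => ({ kind: "tool", name: "retry" }), 5);
console.log(stuck.status);
```

The surrounding deterministic workflow decides what "escalated" means, for example pausing for human review.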

The 12 Factors

Factor 1: Natural Language to Tool Calls

Convert user requests into structured JSON that triggers deterministic code. The LLM decides what to do; your code controls how it executes.

// User: "create a payment link for $750"
// LLM outputs structured tool call:
{
  "tool": "create_payment_link",
  "parameters": {
    "amount": 750,
    "currency": "USD"
  }
}

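// Your code handles execution deterministically
async function executeToolCall(toolCall: ToolCall) {
  switch (toolCall.tool) {
    case "create_payment_link":
      return await stripe.paymentLinks.create({
        line_items: [{
          price_data: {
            currency: toolCall.parameters.currency,
            // Amounts are integers in the smallest currency unit (cents for USD);
            // round because e.g. 19.99 * 100 is not exactly 1999 in floating point
            unit_amount: Math.round(toolCall.parameters.amount * 100),
          },
          quantity: 1,
        }],
      });
    default:
      // Reject unrecognized tools instead of silently returning undefined
      throw new Error(`Unknown tool: ${toolCall.tool}`);
  }
}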
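
Model output is only text until it has been parsed, so it helps to validate the tool call before dispatching it. The following is a sketch under assumptions: `parseToolCall` and `CreatePaymentLink` are hypothetical names, and the checks (positive finite amount, three-letter currency code) are illustrative rules rather than requirements stated in the original.

```typescript
// Hypothetical shape for the payment-link tool call from the example above.
interface CreatePaymentLink {
  tool: "create_payment_link";
  parameters: { amount: number; currency: string };
}

// Parse raw model output and reject anything that does not match the expected shape.
function parseToolCall(raw: string): CreatePaymentLink {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error("Model output is not valid JSON");
  }
  if (typeof data !== "object" || data === null) {
    throw new Error("Tool call must be a JSON object");
  }
  const call = data as { tool?: unknown; parameters?: unknown };
  if (call.tool !== "create_payment_link") {
    throw new Error(`Unknown tool: ${String(call.tool)}`);
  }
  const params = call.parameters;
  if (typeof params !== "object" || params === null) {
    throw new Error("Missing parameters");
  }
  const { amount, currency } = params as { amount?: unknown; currency?: unknown };
  if (typeof amount !== "number" || !Number.isFinite(amount) || amount <= 0) {
    throw new Error("amount must be a positive number");
  }
  if (typeof currency !== "string" || !/^[A-Za-z]{3}$/.test(currency)) {
    throw new Error("currency must be a three-letter code");
  }
  return { tool: "create_payment_link", parameters: { amount, currency } };
}

const parsed = parseToolCall(
  '{"tool":"create_payment_link","parameters":{"amount":750,"currency":"USD"}}'
);
console.log(parsed.parameters.amount);
```

Only a value that passes these checks reaches the deterministic execution code, which keeps malformed or unexpected output away from the payment API.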

Factor 2: Own Your Prompts

Resist black-box framework abstractions. Treat prompts as first-class code.

// Bad: Hidden in framework
const agent = new MagicAgent({ task: "deployment" });

// Good: Explicit, testable prompts
const DEPLOYMENT_PROMPT = `
You are a deployment assistant. You have access to the following tools:
- deploy_to_staging: Deploy the current branch to staging
- run_tests: Execute the test suite
- deploy_to_production: Deploy to production (requires approval)

Current context:
- Branch: {{branch}}
- Last commit: {{commit}}
- Test status: {{testStatus}}

Respond with the next action to take.
`;

function buildPrompt(context: DeploymentContext): string {
  return DEPLOYMENT_PROMPT
    .replace("{{branch}}", context.branch)
    .replace("{{commit}}", context.lastCommit)
    .replace("{{testStatus}}", context.testStatus);
}
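
Chained `.replace` calls fill placeholders one at a time, so a value that happens to contain `{{commit}}` would be expanded by a later call, and a misspelled placeholder goes unnoticed. A possible alternative is a single-pass renderer. This is a sketch only; `renderTemplate` is a hypothetical helper, not part of the original.

```typescript
// Replace every {{name}} placeholder in one pass; fail loudly on unknown names.
function renderTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error(`Missing template value: ${key}`);
    }
    // A replacer function's return value is inserted literally.
    return String(values[key]);
  });
}

const rendered = renderTemplate("Branch: {{branch}}, commit: {{commit}}", {
  branch: "feature/{{commit}}", // not expanded, because values are never re-scanned
  commit: "abc123",
});
console.log(rendered); // Branch: feature/{{commit}}, commit: abc123
```

Because the prompt is plain code, a renderer like this can be unit-tested like any other function.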

Factor 3: Own Your Context Window

Context engineering matters more than tweaking model parameters. Design custom context formats optimized for your domain.

// Structure context for token efficiency and comprehension
function buildContext(events: Event[]): string {
  return `
<system_state>
  <current_step>3 of 5</current_step>
  <status>awaiting_approval</status>
</system_state>

<event_history>
${events.map(e => `  <event type="${e.type}" ts="${e.timestamp}">${e.summary}</event>`).join('\n')}
</event_history>

<available_actions>
  - approve_deployment
  - reject_deployment
  - request_more_info
</available_actions>
`;
}
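
When event text is interpolated into an XML-like format, a summary containing `<`, `&`, or a quote can break the structure the model is supposed to read. A minimal sketch of escaping before interpolation follows; `escapeXml`, `renderEvent`, and the `ContextEvent` type are assumptions introduced for illustration.

```typescript
// Escape the five XML special characters; & must be replaced first.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Assumed event shape, mirroring the fields used in buildContext above.
interface ContextEvent {
  type: string;
  timestamp: number;
  summary: string;
}

function renderEvent(e: ContextEvent): string {
  return `<event type="${escapeXml(e.type)}" ts="${e.timestamp}">${escapeXml(e.summary)}</event>`;
}

console.log(
  renderEvent({ type: "tool_result", timestamp: 1700000000, summary: 'exit code < 1 & "ok"' })
);
// <event type="tool_result" ts="1700000000">exit code &lt; 1 &amp; &quot;ok&quot;</event>
```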

Factor 4: Tools Are Just Structured Outputs

Tools are JSON outputs representing the next action. This separation allows flexibility—the same output can trigger different implementations.

// Tool definition - what the LLM sees
const tools = [
  {
    name: "send_notification",
    description: "Send a notification to the user",
    parameters: {
      channel: { type: "string", enum: ["slack", "email", "sms"] },
      message: { type: "string" }
    }
  }
];

// Tool execution - what actually happens
function executeTool(toolCall: ToolCall) {
  // Same tool, different backends: the channel parameter selects the implementation
  const { channel, message } = toolCall.parameters;

  switch (channel) {
    case "slack":
      return slackClient.postMessage(message);
    case "email":
      return emailService.send(message);
    case "sms":
      return twilioClient.sendSms(message);
    default:
      // The enum should prevent this, but model output is not guaranteed to respect it
      throw new Error(`Unsupported channel: ${channel}`);
  }
}
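
The same separation also makes it possible to swap the whole implementation, for example a dry-run handler in tests. The sketch below assumes hypothetical names (`NotifyCall`, `Handler`, `pickHandler`), and the "live" handler only returns a string as a stand-in for a real client call.

```typescript
type NotifyCall = { channel: "slack" | "email" | "sms"; message: string };
type Handler = (call: NotifyCall) => string;

// Two interchangeable implementations for the same structured output.
// liveHandler is a stand-in: a real one would call the Slack, email, or SMS client.
const liveHandler: Handler = (call) => `sent via ${call.channel}: ${call.message}`;
const dryRunHandler: Handler = (call) =>
  `[dry-run] would send via ${call.channel}: ${call.message}`;

function pickHandler(env: "production" | "test"): Handler {
  return env === "production" ? liveHandler : dryRunHandler;
}

const notifyCall: NotifyCall = { channel: "slack", message: "Deploy finished" };
console.log(pickHandler("test")(notifyCall));
// [dry-run] would send via slack: Deploy finished
```

The LLM's output is identical in both environments; only the code that interprets it changes.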

Factor 5: Unify Execution State and Business State

Derive execution state from context history. A single thread provides serialization, debugging transparency, and easy resumption.

interface AgentThread {
  id: string;
  events: Event[];
  status: "running" | "paused" | "completed" | "failed";
}

// State is derived from events, not stored separately
function deriveState(thread: AgentThread): ExecutionState {
  // lastEvent is undefined for a new thread that has no events yet
  const lastEvent = thread.events[thread.events.length - 1];

  return {
    currentStep: thread.events.filter(e => e.type === "step_complete").length,
    pendingApprovals: thread.events.filter(e =>
      e.type === "approval_requested" &&
      !thread.events.find(a => a.type === "approval_granted" && a.requestId === e.id)
    ),
    errors: thread.events.filter(e => e.type === "error"),
    // Optional chaining avoids a crash on an empty event list
    canResume: lastEvent?.type !== "completed" && lastEvent?.type !== "failed"
  };
}

Factor 6: Launch/Pause/Resume with Simple APIs

Agents need simple interfaces for starting, pausing, and resuming—especially between tool selection and execution.

class Agent {
  async launch(input: string): Promise<AgentThread> {
    const thread = await this.createThread();
    return this.run(thread, input);
  }

  async pause(threadId: string): Promise<void> {
    await this.db.updateThread(threadId, { status: "paused" });
  }

  async resume(threadId: string, feedback?: string): Promise<AgentThread> {
    const thread = await this.db.getThread(threadId);
    if (feedback) {
      thread.events.push({ type: "human_feedback", content: feedback });
    }
    // Mark the thread as running again so a loop gated on status can continue
    thread.status = "running";
    return this.run(thread);
  }
}

// Webhook endpoint for external triggers
app.post("/webhook/resume/:threadId", async (req, res) => {
  const { threadId } = req.params;
  const { feedback } = req.body;
  await agent.resume(threadId, feedback);
  res.json({ status: "resumed" });
});

Factor 7: Contact Humans with Tool Calls

Treat human contact as structured tool calls, not plaintext. This enables multi-channel communication and auditable workflows.

const humanTools = [
  {
    name: "request_human_approval",
    description: "Request approval from a human before proceeding",
    parameters: {
      action: { type: "string", description: "What action needs approval" },
      context: { type: "string", description: "Relevant context for decision" },
      urgency: { type: "string", enum: ["low", "medium", "high"] },
      channel: { type: "string", enum: ["slack", "email"] }
    }
  },
  {
    name: "request_human_input",
    description: "Ask a human for information or clarification",
    parameters: {
      question: { type: "string" },
      options: { type: "array", items: { type: "string" }, optional: true }
    }
  }
];

// Execution pauses and waits for human response
async function executeHumanTool(toolCall: ToolCall, thread: AgentThread) {
  await notifyHuman(toolCall);
  thread.status = "paused";
  thread.events.push({
    type: "awaiting_human",
    toolCall,
    timestamp: Date.now()
  });
  return { status: "paused", awaiting: "human_response" };
}

Factor 8: Own Your Control Flow

Build custom loops and branching logic. Different tool types warrant different handling.

async function agentLoop(thread: AgentThread): Promise<AgentThread> {
  while (thread.status === "running") {
    const toolCall = await llm.getNextAction(thread);

    switch (classifyTool(toolCall)) {
      case "immediate": {
        // Data fetching - execute and continue
        const result = await executeTool(toolCall);
        thread.events.push({ type: "tool_result", toolCall, result });
        thread.consecutiveErrors = 0; // a success ends the error streak
        break;
      }

      case "requires_approval":
        // Human approval - pause the loop
        await requestApproval(toolCall);
        thread.status = "paused";
        return thread;

      case "terminal":
        // Completion - end the loop
        thread.status = "completed";
        return thread;

      case "error":
        // Error handling - retry until the third consecutive error, then escalate
        thread.consecutiveErrors++;
        if (thread.consecutiveErrors >= 3) {
          await escalateToHuman(thread);
          thread.status = "paused";
          return thread;
        }
        break;
    }
  }
  return thread;
}

Factor 9: Compact Errors into Context Window

Feed error messages back into context for self-healing. Implement thresholds to prevent spin-outs.

async function handleError(error: Error, thread: AgentThread): Promise<void> {
  thread.events.push({
    type: "error",
    message: error.message,
    stack: error.stack?.slice(0, 500), // Truncate for context efficiency
    timestamp: Date.now()
  });

  thread.consecutiveErrors++;

  if (thread.consecutiveErrors >= 3) {
    // Escalate to human rather than spinning
    await requestHumanHelp(thread, {
      reason: "consecutive_errors",
      errors: thread.events.filter(e => e.type === "error").slice(-3)
    });
    thread.status = "paused";
  }
}

// Context includes recent errors for LLM to learn from
function buildErrorContext(thread: AgentThread): string {
  const recentErrors = thread.events
    .filter(e => e.type === "error")
    .slice(-3);

  if (recentErrors.length === 0) return "";

  return `
<recent_errors>
${recentErrors.map(e => `  <error ts="${e.timestamp}">${e.message}</error>`).join('\n')}
</recent_errors>

Note: You have encountered ${recentErrors.length} recent errors.
Please try a different approach or request human assistance.
`;
}

Factor 10: Small, Focused Agents

Scope agents to 3-20 steps maximum. As context grows, LLM performance degrades. This aligns with Liquidation Cadence: ship focused agents that deliver real value rather than sprawling systems that never complete.

Figure: Context growth degrades performance

// Bad: Monolithic agent
const megaAgent = new Agent({
  capabilities: ["deploy", "test", "monitor", "rollback", "notify", "audit"]
});

// Good: Focused agents composed in a DAG
const deployAgent = new Agent({ capabilities: ["deploy_staging", "deploy_prod"] });
const testAgent = new Agent({ capabilities: ["run_tests", "analyze_results"] });
const notifyAgent = new Agent({ capabilities: ["slack", "email", "pagerduty"] });

// Deterministic orchestration
async function deploymentWorkflow(pr: PullRequest) {
  // Step 1: Deploy to staging (deterministic)
  await deployToStaging(pr);

  // Step 2: Run tests (agent decides which tests)
  const testPlan = await testAgent.planTests(pr);
  const results = await runTests(testPlan);

  // Step 3: If tests pass, request production approval; otherwise alert on the failure
  if (results.passed) {
    await deployAgent.requestProdApproval(pr);
  } else {
    await notifyAgent.alertFailure(results);
  }
}

Factor 11: Trigger from Anywhere

Enable agents to launch from events, crons, webhooks, or user actions.

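// Webhook trigger
app.post("/webhook/github", async (req, res) => {
  if (req.body.action === "closed" && req.body.pull_request?.merged) {
    await deployAgent.launch({ pr: req.body.pull_request });
  }
  // Always acknowledge the delivery so the sender is not left waiting for a timeout
  res.sendStatus(200);
});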

// Cron trigger
cron.schedule("0 9 * * *", async () => {
  await reportAgent.launch({ type: "daily_summary" });
});

// Slack trigger
slack.command("/deploy", async ({ command, ack }) => {
  await ack();
  await deployAgent.launch({
    branch: command.text,
    requestedBy: command.user_id
  });
});

// Event-driven trigger
eventBus.on("error_spike_detected", async (event) => {
  await incidentAgent.launch({
    alert: event,
    channel: "pagerduty"
  });
});

Factor 12: Make Your Agent a Stateless Reducer

Treat an agent as a pure function that takes the current state and the next event and returns the new state, keeping no hidden state between calls.

Figure: Agent as a fold/reduce operation

// Agent as a pure function
type AgentReducer = (state: AgentState, event: Event) => AgentState;

const agentReducer: AgentReducer = (state, event) => {
  switch (event.type) {
    case "user_input":
      return { ...state, pendingInput: event.content };

    case "tool_call":
      return { ...state, lastToolCall: event.toolCall };

    case "tool_result":
      return {
        ...state,
        context: [...state.context, event],
        lastToolCall: null,
        consecutiveErrors: 0 // a successful result ends the error streak
      };

    case "error":
      return {
        ...state,
        errors: [...state.errors, event],
        consecutiveErrors: state.consecutiveErrors + 1
      };

    case "human_response":
      return {
        ...state,
        context: [...state.context, event],
        status: "running"
      };

    default:
      return state;
  }
};

// Replay any state by reducing over events
function replayState(events: Event[]): AgentState {
  return events.reduce(agentReducer, initialState);
}
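
Because the reducer is pure, a thread can be resumed from a snapshot: folding the remaining events over a saved state gives the same result as replaying everything from the start. The sketch below uses a deliberately reduced state shape; `MiniState`, `MiniEvent`, and `miniReducer` are stand-ins for illustration, not the types used above.

```typescript
type MiniEvent =
  | { type: "tool_result" }
  | { type: "error"; message: string };

interface MiniState {
  steps: number;
  consecutiveErrors: number;
}

const emptyState: MiniState = { steps: 0, consecutiveErrors: 0 };

// Pure: returns a new object and never mutates its inputs.
function miniReducer(state: MiniState, event: MiniEvent): MiniState {
  switch (event.type) {
    case "tool_result":
      return { steps: state.steps + 1, consecutiveErrors: 0 };
    case "error":
      return { ...state, consecutiveErrors: state.consecutiveErrors + 1 };
  }
}

const eventLog: MiniEvent[] = [
  { type: "tool_result" },
  { type: "error", message: "timeout" },
  { type: "error", message: "timeout" },
];

// Full replay from the beginning.
const full = eventLog.reduce(miniReducer, emptyState);

// Resume: snapshot after the first event, then fold only the rest.
const snapshot = eventLog.slice(0, 1).reduce(miniReducer, emptyState);
const resumed = eventLog.slice(1).reduce(miniReducer, snapshot);

console.log(JSON.stringify(full) === JSON.stringify(resumed)); // true
```

This is what makes pause and resume cheap: persisting the event list (or a snapshot plus the events after it) is enough to rebuild the agent's state on any machine.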

DeployBot Example

Figure: High-level DeployBot architecture
Figure: DeployBot in action
Figure: Human-agent conversation flow

Key Takeaway

“I don’t know what’s the best approach, but you want flexibility to try everything.”

Own your stack to experiment freely with prompts, context structures, control flow, and error handling strategies.


Business Context

These architectural principles serve a higher purpose: delivering real value to customers. Well-designed agent systems should optimize for measurable outcomes, not technical elegance for its own sake.

