Sub-agents: Accuracy vs Latency Trade-off

James Phoenix

Sub-agents trade latency for accuracy. Use them when correctness matters more than speed.

Ready to implement? See the Full Architecture Guide for implementation patterns.


What Are Sub-agents?

Sub-agents are specialized AI assistants that Claude Code can delegate tasks to. Each sub-agent:

  • Has a specific purpose and expertise area
  • Uses its own context window (separate from main conversation)
  • Can be configured with specific tools
  • Includes a custom system prompt guiding behavior

The Core Trade-off

┌─────────────────────────────────────────────────┐
│                                                 │
│   Accuracy ◄──────────────────────► Latency    │
│                                                 │
│   Sub-agent:     ████████████░░░░   High/High  │
│   Main agent:    ██████░░░░░░░░░░   Med/Low    │
│   Script:        ████░░░░░░░░░░░░   Low/None   │
│                                                 │
└─────────────────────────────────────────────────┘
Approach     Accuracy                            Latency                               Token Cost
Sub-agent    High (fresh context, specialized)   High (cold start, gathering context)  Higher
Main agent   Medium (context pollution)          Low (already running)                 Medium
Script       Fixed (deterministic)               None                                  Zero

Why Sub-agents Are More Accurate

1. Fresh Context Window

Main conversation accumulates noise. Sub-agents start clean.

Main conversation (50k tokens):
- Previous debugging session
- Unrelated file reads
- Abandoned approaches
- Old error messages
        ↓
    Context pollution = degraded performance
Sub-agent context (5k tokens):
- Just the task description
- Only relevant files
- Focused system prompt
        ↓
    Clean context = better reasoning

2. Specialized System Prompts

Sub-agents can have detailed, task-specific instructions:

---
name: security-reviewer
description: Security audit specialist. Use proactively after code changes.
tools: Read, Grep, Glob, Bash
---

You are a security expert reviewing code for vulnerabilities.

Focus on:
- OWASP Top 10 vulnerabilities
- Input validation gaps
- Authentication/authorization flaws
- Secrets exposure
- Injection risks

For each finding:
1. Severity (Critical/High/Medium/Low)
2. File and line number
3. Specific vulnerability type
4. Exploitation scenario
5. Remediation code

A main agent can’t hold this level of specialization for every domain.

3. Tool Restriction

Limiting tools focuses the agent:

# Code reviewer - read only
tools: Read, Grep, Glob, Bash

# Fixer - can edit
tools: Read, Edit, Bash, Grep, Glob

# Deployer - specific access
tools: Bash

Fewer tools = less decision paralysis = better execution.


When Sub-agents Win

High-Stakes Decisions

Security review before production deploy
        ↓
    Use sub-agent (accuracy > speed)

Complex Analysis

Analyze entire codebase for performance issues
        ↓
    Use sub-agent (clean context for large scope)

Specialized Domains

Database query optimization
        ↓
    Use specialized sub-agent with SQL expertise

When Main Agent Wins

Quick Iterations

Fix this typo
        ↓
    Main agent (speed wins; a sub-agent is overkill)

Context Already Loaded

Continue the refactor we started
        ↓
    Main agent (has the context)

Simple Tasks

Run the tests
        ↓
    Main agent or script (no specialization needed)

The Latency Cost

Sub-agents add latency because they:

  1. Cold start – Initialize new context
  2. Gather context – Re-read files main agent already knows
  3. Build understanding – Can’t leverage prior conversation
Main agent task: 5 seconds (context ready)
Sub-agent task: 15-30 seconds (must gather context)

Mitigation: Resumable sub-agents can continue previous conversations:

> Resume agent abc123 and continue the analysis

Built-in Sub-agents

Claude Code includes these out of the box:

Explore (Fast, Read-only)

Model: Haiku (fast)
Tools: Glob, Grep, Read, Bash (read-only)
Purpose: Quick codebase exploration

General-purpose (Capable, Full Access)

Model: Sonnet
Tools: All
Purpose: Complex multi-step tasks

Plan (Research for Planning)

Model: Sonnet
Tools: Read, Glob, Grep, Bash
Purpose: Gather context for planning

Creating Custom Sub-agents

File Location

.claude/agents/         # Project-level (highest priority)
~/.claude/agents/       # User-level (lower priority)
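As a minimal sketch, a project-level sub-agent is just a markdown file with YAML frontmatter written into `.claude/agents/`. The helper below generates one programmatically; the function name and the example agent body are illustrative, not part of Claude Code itself:

```python
from pathlib import Path
from textwrap import dedent

def create_agent(agents_dir: Path, name: str, description: str,
                 tools: list[str], body: str) -> Path:
    """Write a sub-agent definition file (YAML frontmatter + system prompt)."""
    agents_dir.mkdir(parents=True, exist_ok=True)
    tools_str = ", ".join(tools)
    content = dedent(f"""\
        ---
        name: {name}
        description: {description}
        tools: {tools_str}
        ---

        {body}
        """)
    path = agents_dir / f"{name}.md"
    path.write_text(content)
    return path

# Example: a read-only reviewer agent at project level
path = create_agent(
    Path(".claude/agents"),
    name="security-reviewer",
    description="Security audit specialist. Use proactively after code changes.",
    tools=["Read", "Grep", "Glob", "Bash"],
    body="You are a security expert reviewing code for vulnerabilities.",
)
```

Project-level files take priority over user-level ones, so the same agent name in `.claude/agents/` shadows a copy in `~/.claude/agents/`.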

Template

---
name: your-agent-name
description: When to use this agent. Use proactively for X.
tools: Tool1, Tool2, Tool3
model: sonnet  # or haiku, opus, inherit
---

You are an expert in [domain].

When invoked:
1. First step
2. Second step
3. Third step

Focus on:
- Key consideration 1
- Key consideration 2

Output format:
- Findings with file:line references
- Severity ratings
- Specific recommendations

Example: Security + Performance Swarm

Combine multiple specialized sub-agents:

> Run security-reviewer on src/auth/
> Run performance-analyzer on src/api/
> Run test-coverage-checker on src/

Aggregate findings, prioritize by severity.

Each sub-agent:

  • Has specialized expertise
  • Works in clean context
  • Returns focused results

Combined: higher accuracy than one generalist agent.
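The aggregation step can be sketched in plain Python. The finding records below are hypothetical (there is no standard finding schema); the point is the severity-first ordering:

```python
# Hypothetical findings returned by each specialized sub-agent
findings = [
    {"agent": "security-reviewer", "severity": "High",
     "file": "src/auth/login.py", "issue": "Missing input validation"},
    {"agent": "performance-analyzer", "severity": "Medium",
     "file": "src/api/users.py", "issue": "N+1 query"},
    {"agent": "security-reviewer", "severity": "Critical",
     "file": "src/auth/token.py", "issue": "Hardcoded secret"},
    {"agent": "test-coverage-checker", "severity": "Low",
     "file": "src/utils.py", "issue": "Uncovered branch"},
]

SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

# Prioritize: most severe findings first
prioritized = sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])

for f in prioritized:
    print(f'[{f["severity"]}] {f["file"]}: {f["issue"]} ({f["agent"]})')
```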


The Decision Framework

Is the task simple and context is fresh?
    YES → Main agent
    NO  ↓

Is it a repeated workflow?
    YES → Script (see: ad-hoc-to-scripts)
    NO  ↓

Does it need specialized expertise?
    YES → Custom sub-agent
    NO  ↓

Does it need clean context for complex reasoning?
    YES → Sub-agent
    NO  → Main agent
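The flowchart above translates directly into a small helper (illustrative only; each boolean input is a judgment call you make per task):

```python
def choose_approach(simple_and_fresh: bool,
                    repeated_workflow: bool,
                    needs_expertise: bool,
                    needs_clean_context: bool) -> str:
    """Walk the decision framework top to bottom, first match wins."""
    if simple_and_fresh:
        return "main agent"
    if repeated_workflow:
        return "script"
    if needs_expertise:
        return "custom sub-agent"
    if needs_clean_context:
        return "sub-agent"
    return "main agent"

# "Fix this typo" -> main agent
print(choose_approach(True, False, False, False))
# "Security review before deploy" -> custom sub-agent
print(choose_approach(False, False, True, True))
```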

Key Principle

Sub-agents are for accuracy. Main agent is for speed. Scripts are for repetition.

Choose based on what matters most for the task at hand.

