Agent Swarm Patterns for Thoroughness

James Phoenix

When you need maximum confidence, run multiple sub-agents multiple times. Aggregate, de-dupe, then plan.


The Core Idea

A single agent run has blind spots. Multiple agents working from multiple perspectives catch what any individual run misses.

10 agents × 4 runs = 40 analyses → Aggregate → De-dupe → Solid plan

Example prompt:

“Launch at least 10 sub-agents to find performance issues, critical errors, or bugs in the pub/sub and rate limiting code. Aggregate the data, de-dupe, then make a SOLID plan.”


Two Modes of Work

Investigative Mode

Find problems in existing code.

recurse(
  human(decide what to investigate)
  → agents(investigate)
  → find gaps/issues/refactors
  → fix gaps
  → run tests
  → fix tests
)

Forward Planning Mode

Build new features with confidence.

recurse(
  human(define new chunk of work)
  → agents(generate spec)
  → human(refine spec)
  → agents(generate code)
  → run tests
  → confirm working
  → de-dupe/consolidate
  → DIGEST.md
)

Perspective Multiplication Patterns

Pattern 1: Many Perspectives

Generate different perspectives, collapse into optimal plan.

Agent 1: Security perspective
Agent 2: Performance perspective
Agent 3: Maintainability perspective
Agent 4: Edge cases perspective
Agent 5: Integration perspective
         ↓
    Aggregate
         ↓
   De-duplicate
         ↓
    Final Plan

Prompt:

Launch 5 sub-agents to review this code, each with a different focus:
1. Security vulnerabilities
2. Performance bottlenecks
3. Code maintainability
4. Edge cases and error handling
5. Integration points and contracts

Aggregate findings, remove duplicates, prioritize by severity.

Pattern 2: Same Perspective Multiple Times

Run the same analysis multiple times to catch probabilistic misses.

Agent 1: Find bugs (run 1)
Agent 2: Find bugs (run 2)
Agent 3: Find bugs (run 3)
Agent 4: Find bugs (run 4)
         ↓
    Aggregate
         ↓
   De-duplicate
         ↓
  Higher confidence

Why this works: LLMs are probabilistic. Run the same prompt four times and you’ll get different findings each time. The union catches more than any single run.
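The union effect is easy to see in code. A minimal sketch, assuming each run returns a list of finding strings (the findings below are illustrative, not real agent output):

```typescript
// De-duplicate findings across repeated runs of the same prompt.
function unionFindings(runs: string[][]): string[] {
  const seen = new Set<string>();
  for (const run of runs) {
    for (const finding of run) {
      seen.add(finding.trim().toLowerCase()); // naive normalization
    }
  }
  return [...seen];
}

// Four runs of "find bugs" each surface an overlapping-but-different subset.
const runs = [
  ['unbounded queue growth', 'missing rate-limit header'],
  ['missing rate-limit header', 'race in token refill'],
  ['unbounded queue growth', 'off-by-one in window size'],
  ['race in token refill'],
];

const union = unionFindings(runs);
// The union holds 4 distinct findings; no single run found more than 2.
```

Real findings are free text, so a production version would need fuzzier matching than lowercase trimming, but the shape of the win is the same: the union dominates any single sample.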

Pattern 3: Many-Many Perspectives

Maximum thoroughness: multiple perspectives × multiple runs.

10 agents × 4 runs = 40 total analyses
         ↓
    Aggregate all
         ↓
   De-duplicate
         ↓
    Final Plan

Prompt:

I want maximum confidence on this analysis.

Run 10 different sub-agents (security, performance, types, tests,
edge cases, race conditions, memory leaks, API contracts, error
handling, logging/observability).

Run each perspective 4 times.

Aggregate all 40 analyses, de-duplicate findings, rank by:
1. Severity (critical > high > medium > low)
2. Confidence (found by multiple runs > single run)
3. Effort to fix

Output a prioritized action plan.
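The three-key ranking in the prompt can be sketched directly. The `RankedFinding` shape and the effort scale (1 = cheap fix) are assumptions for illustration:

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

interface RankedFinding {
  issue: string;
  severity: Severity;
  runsReporting: number; // confidence: how many of the 40 analyses found it
  effort: number;        // 1 = cheap fix, 5 = major work
}

const severityOrder: Severity[] = ['critical', 'high', 'medium', 'low'];

// Sort by severity, then confidence, then cheapest fix first.
function rankFindings(findings: RankedFinding[]): RankedFinding[] {
  return [...findings].sort(
    (a, b) =>
      severityOrder.indexOf(a.severity) - severityOrder.indexOf(b.severity) ||
      b.runsReporting - a.runsReporting || // more runs agreeing wins ties
      a.effort - b.effort                  // cheaper fixes first
  );
}
```

For example, two critical findings tie on severity, so the one reported by 8 of 40 runs outranks the one reported by 3.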

Specialized Analysis Patterns

Spec Drift Detection

Check code vs design vs tests and spot mismatches.

Agent 1: Read the spec/design docs
Agent 2: Analyze the implementation
Agent 3: Analyze the tests
         ↓
    Compare all three
         ↓
   Find mismatches

Prompt:

Launch 3 sub-agents:
1. Extract intended behavior from specs in `/docs`
2. Extract actual behavior from implementation in `/src`
3. Extract tested behavior from tests in `/tests`

Compare the three. Report:
- Spec says X but code does Y
- Code does X but no test covers it
- Test expects X but spec says Y
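Once each agent's output is normalized into a set of behavior identifiers, the three-way comparison reduces to set differences. A minimal sketch, assuming hypothetical behavior keys like `'login'`:

```typescript
// Compare intended (spec), actual (code), and tested (tests) behaviors.
function specDrift(spec: Set<string>, code: Set<string>, tests: Set<string>) {
  return {
    specNotInCode: [...spec].filter(b => !code.has(b)),   // spec says X, code doesn't do it
    codeNotTested: [...code].filter(b => !tests.has(b)),  // code does X, no test covers it
    testedNotInSpec: [...tests].filter(b => !spec.has(b)), // test expects X, spec never says it
  };
}
```

The hard part in practice is getting three agents to emit comparable keys; a shared naming convention in the prompt (e.g. one short slug per behavior) makes the diff mechanical.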

Invariant Extraction

Have agents tell you what MUST always be true.

Agents analyze code
         ↓
   Extract invariants
         ↓
  "These must ALWAYS be true"
         ↓
   Generate assertions/tests

Prompt:

Analyze this codebase and extract invariants—things that must
ALWAYS be true for the system to be correct.

Examples:
- "User balance must never be negative"
- "Every request must have a trace ID"
- "Cache TTL must be less than DB TTL"

For each invariant:
1. State the invariant
2. Where it should be enforced
3. Draft the assertion or test
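Step 3 turns an invariant into running code. A sketch for the first example invariant ("user balance must never be negative"); the `Account` shape and function names are assumptions:

```typescript
interface Account {
  id: string;
  balance: number;
}

// Runtime enforcement of the extracted invariant.
function assertBalanceInvariant(account: Account): void {
  if (account.balance < 0) {
    throw new Error(`Invariant violated: balance of ${account.id} is negative`);
  }
}

// Enforce at the mutation site, so every write path is covered.
function withdraw(account: Account, amount: number): Account {
  const next = { ...account, balance: account.balance - amount };
  assertBalanceInvariant(next);
  return next;
}
```

The same invariant can also be drafted as a property-based test (withdraw any amount up to the balance, assert the result is never negative), which is often what the agent should generate.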

Swarm Types for Read Operations

Individual Swarms

Multiple agents working independently on the same problem.

┌─────────┐  ┌─────────┐  ┌─────────┐
│ Agent 1 │  │ Agent 2 │  │ Agent 3 │
└────┬────┘  └────┬────┘  └────┬────┘
     │            │            │
     └────────────┼────────────┘
                  ↓
             Aggregate

Use when: You want breadth of coverage, independent analysis.

Competing Swarms

Two agents compete, critique each other, converge on truth.

┌─────────┐        ┌─────────┐
│ Agent A │ ←───→  │ Agent B │
└────┬────┘        └────┬────┘
     │    critique      │
     └────────┬─────────┘
              ↓
         Converged
          analysis

Prompt:

Launch two competing agents:

Agent A: Find all the problems with this code.
Agent B: Review Agent A's findings. Which are valid? Which are
         false positives? What did Agent A miss?

Then Agent A reviews Agent B's critique.

Continue for 2 rounds. Output the agreed-upon findings.

Use when: You want higher confidence through adversarial review.
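The control flow of the competing-swarm loop can be sketched with the LLM calls stubbed out. Here each `Review` function stands in for an agent invocation that keeps only the claims it accepts; the stub just shows the alternating-critique structure:

```typescript
type Review = (claims: string[]) => string[];

// Run `rounds` rounds of alternating critique and return the surviving claims.
function adversarialConverge(
  initialClaims: string[],
  reviewers: [Review, Review],
  rounds = 2
): string[] {
  let claims = initialClaims;
  for (let i = 0; i < rounds; i++) {
    claims = reviewers[i % 2](claims); // A and B take turns reviewing
  }
  return claims; // the agreed-upon findings
}
```

In a real setup each reviewer would be a fresh agent call whose prompt includes the other agent's current claims, and "dropping a claim" means the reviewer flagged it as a false positive.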


Dimensions of Analysis

Swarms can operate across different dimensions:

| Dimension | What’s Analyzed |
| --- | --- |
| Static code | Source files only |
| Code + tests | Implementation and test coverage |
| Code + tests + linter | Add linter/type checker output |
| Code + tests + runtime | Add logs, traces, metrics |
| Code + spec + tests | Full spec drift detection |

Example multi-dimensional prompt:

Analyze the auth module across all dimensions:

1. Static analysis: Read src/auth/*.ts
2. Test coverage: Read tests/auth/*.test.ts, check coverage gaps
3. Type safety: Run `tsc --noEmit`, analyze errors
4. Linter: Run `biome check`, analyze warnings
5. Runtime: Review recent error logs for auth failures

Synthesize findings across all dimensions.

Implementation: Swarm Launcher

# .claude/commands/swarm-analyze.md
I want to run a thorough analysis swarm on the specified code.

Parameters:
- Target: $ARGUMENTS (files or directories to analyze)
- Perspectives: security, performance, types, tests, edge-cases,
  race-conditions, error-handling, observability
- Runs per perspective: 2

Process:
1. Launch sub-agents for each perspective
2. Each agent analyzes the target independently
3. Aggregate all findings
4. De-duplicate (same issue found by multiple agents = higher confidence)
5. Rank by severity and confidence
6. Output prioritized action plan with specific file:line references

When to Use Swarms

| Scenario | Pattern |
| --- | --- |
| Pre-deploy safety check | Many perspectives × 1 run |
| Critical bug investigation | Same perspective × 4 runs |
| Major refactor validation | Many-many (10 × 4) |
| Spec compliance check | Spec drift detection |
| New codebase onboarding | Invariant extraction |
| Code review augmentation | Competing swarms |

Stacking Claude Code with Agent SDK

The most powerful pattern: programmatically launch multiple Claude Code instances.


Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Orchestrator (Agent SDK)                 │
│                                                             │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │
│  │ Claude   │  │ Claude   │  │ Claude   │  │ Claude   │   │
│  │ Code #1  │  │ Code #2  │  │ Code #3  │  │ Code #4  │   │
│  │          │  │          │  │          │  │          │   │
│  │ Security │  │ Perf     │  │ Types    │  │ Tests    │   │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘   │
│       │             │             │             │          │
│       └─────────────┴──────┬──────┴─────────────┘          │
│                            ↓                               │
│                    Aggregate + De-dupe                     │
│                            ↓                               │
│                      Final Report                          │
└─────────────────────────────────────────────────────────────┘

Implementation

import { ClaudeAgent } from '@anthropic/agent-sdk';

async function swarmAnalysis(target: string) {
  const perspectives = [
    { name: 'security', prompt: `Find security vulnerabilities in ${target}` },
    { name: 'performance', prompt: `Find performance issues in ${target}` },
    { name: 'types', prompt: `Find type safety issues in ${target}` },
    { name: 'tests', prompt: `Find missing test coverage in ${target}` },
    { name: 'edge-cases', prompt: `Find unhandled edge cases in ${target}` },
  ];

  // Launch 5 Claude Code instances in parallel
  const results = await Promise.all(
    perspectives.map(async (p) => {
      const agent = new ClaudeAgent({
        model: 'claude-sonnet-4-20250514',
        tools: ['Read', 'Grep', 'Glob', 'Bash'],
      });

      const result = await agent.run(p.prompt);
      return { perspective: p.name, findings: result };
    })
  );

  // Aggregate and de-duplicate
  const allFindings = results.flatMap(r => r.findings);
  const deduplicated = deduplicateFindings(allFindings);
  const prioritized = prioritizeBySeverity(deduplicated);

  return prioritized;
}
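The snippet leaves `deduplicateFindings` and `prioritizeBySeverity` undefined. A minimal sketch of both, assuming a simple `{ file, issue, severity }` finding shape (the shape and the key-based normalization are assumptions, not part of any SDK):

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

interface Finding {
  file: string;
  issue: string;
  severity: Severity;
}

// Merge duplicates; the same issue found by multiple agents raises `count`.
function deduplicateFindings(findings: Finding[]): (Finding & { count: number })[] {
  const byKey = new Map<string, Finding & { count: number }>();
  for (const f of findings) {
    const key = `${f.file}:${f.issue.toLowerCase()}`;
    const existing = byKey.get(key);
    if (existing) existing.count++;
    else byKey.set(key, { ...f, count: 1 });
  }
  return [...byKey.values()];
}

const order: Severity[] = ['critical', 'high', 'medium', 'low'];

// Severity first, then agreement count as a confidence tiebreak.
function prioritizeBySeverity<T extends { severity: Severity; count: number }>(fs: T[]): T[] {
  return [...fs].sort(
    (a, b) => order.indexOf(a.severity) - order.indexOf(b.severity) || b.count - a.count
  );
}
```

Keying on `file:issue` only catches near-identical phrasing; a fuzzier match (or an extra LLM pass to cluster findings) catches duplicates worded differently.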

Multiple Runs per Perspective

For maximum thoroughness, run each perspective multiple times:

async function thoroughSwarm(target: string, runsPerPerspective = 4) {
  const perspectives = ['security', 'performance', 'types', 'tests'];

  // 4 perspectives × 4 runs = 16 parallel agents
  const tasks = perspectives.flatMap(p =>
    Array(runsPerPerspective).fill(null).map((_, i) => ({
      perspective: p,
      run: i + 1,
      prompt: `[Run ${i + 1}] Find ${p} issues in ${target}`,
    }))
  );

  const results = await Promise.all(
    tasks.map(t => launchAgent(t.prompt))
  );

  // Findings found by multiple runs = higher confidence
  const withConfidence = scoreByRedundancy(results);

  return withConfidence;
}
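`launchAgent` and `scoreByRedundancy` are left undefined above. A possible sketch of the scoring step, assuming each run's result has been reduced to a list of finding strings:

```typescript
// Confidence = fraction of runs that independently reported a finding.
function scoreByRedundancy(runs: string[][]): { finding: string; confidence: number }[] {
  const counts = new Map<string, number>();
  for (const run of runs) {
    // Count each finding at most once per run.
    for (const f of new Set(run)) counts.set(f, (counts.get(f) ?? 0) + 1);
  }
  return [...counts.entries()]
    .map(([finding, n]) => ({ finding, confidence: n / runs.length }))
    .sort((a, b) => b.confidence - a.confidence);
}
```

A finding reported by 3 of 4 runs scores 0.75; one reported once scores 0.25 and can be triaged as likely noise.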

When to Use SDK Stacking

| Use Case | Agents | Runs | Total |
| --- | --- | --- | --- |
| Quick sanity check | 3 | 1 | 3 |
| Pre-deploy review | 5 | 2 | 10 |
| Security audit | 10 | 4 | 40 |
| Major release | 10 | 4 | 40 |

Trade-off: More agents = more cost + latency, but higher confidence.


Key Insight

Single agent = single roll of the dice. Swarm = loaded dice.

LLMs are probabilistic. Any single run might miss something. Multiple runs from multiple angles approach certainty.

The cost is tokens. The benefit is confidence.

