Adaptive Query Expansion for Agent Review

James Phoenix

Summary

Static review checklists miss context-specific issues. Borrowing “query expansion” from information retrieval, a coordinator agent dynamically generates review queries tailored to the artifact under review, then routes each query to a dedicated sub-agent. The expansion itself is the mechanism for exploring the improvement search space. A math module gets “improve rigor” and “find logical flaws.” A user-facing feature gets “find failure modes” and “improve robustness.” The review dimensions adapt to the domain rather than applying a fixed checklist to everything.

The Problem

Existing multi-agent review patterns (actor-critic, swarm) use static review dimensions. The critic runs the same eight checks whether it is reviewing a sorting algorithm or an OAuth flow. This creates two failure modes:

  1. Irrelevant checks waste tokens. Accessibility reviews on a CLI utility. Documentation reviews on a throwaway script. The critic spends cycles on dimensions that do not apply.

  2. Missing domain-specific checks. A numerical computing module needs mathematical rigor checks that no generic checklist includes. A payments integration needs idempotency checks that are not in a standard security review. Static dimensions have blind spots shaped by whoever wrote the checklist.

The root cause: review dimensions are authored at design time, not at review time.

The Solution

Insert a query expansion step between the task and the review swarm. An LLM examines the artifact and generates the most relevant review queries for that specific context. Each expanded query becomes a sub-agent’s mission.

┌──────────────────────────┐
│   Artifact Under Review  │
└────────────┬─────────────┘
             │
             ▼
┌──────────────────────────┐
│   EXPANSION AGENT        │
│   "What should we check  │
│    for THIS artifact?"   │
└────────────┬─────────────┘
             │
             ▼
   ┌─────────┼─────────┐
   │         │         │
   ▼         ▼         ▼
┌──────┐ ┌──────┐ ┌──────┐
│Query │ │Query │ │Query │
│Agent │ │Agent │ │Agent │
│  1   │ │  2   │ │  N   │
└──┬───┘ └──┬───┘ └──┬───┘
   │        │        │
   └────────┼────────┘
            │
            ▼
   ┌────────────────┐
   │   AGGREGATOR   │
   │   De-dupe,     │
   │   prioritize   │
   └────────────────┘

The expansion agent does not review anything itself. It only generates the queries. This separation means the expansion can be cheap (Haiku) while the actual review sub-agents can be thorough (Sonnet/Opus).

Why This Works: Search Space Exploration

A fixed checklist is a fixed set of coordinates in the quality search space. Query expansion lets an LLM select coordinates dynamically based on the input. This is the same principle as query expansion in information retrieval: a user searches “auth bug,” the system expands to [“authentication failure,” “login error,” “session timeout,” “token expiry”]. Each expansion probes a different region of the search space.

For code review, the “search space” is the space of all possible quality issues. Static checklists sample this space at fixed points. Adaptive expansion samples it at points the LLM predicts are most relevant. The coverage per token spent is higher because you do not waste tokens on irrelevant dimensions.
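The IR analogy can be made concrete with a toy expansion table. This sketch hardcodes the expansions purely for illustration; in the actual pattern an LLM generates them from the artifact:

```typescript
// Toy illustration of IR-style query expansion (not SDK code): one vague
// query fans out into several probes, each covering a different region of
// the search space.
const EXPANSIONS: Record<string, string[]> = {
  "auth bug": [
    "authentication failure",
    "login error",
    "session timeout",
    "token expiry",
  ],
};

function expandQuery(q: string): string[] {
  // Fall back to the original query when no expansion is known.
  return EXPANSIONS[q] ?? [q];
}
```

Each expanded string is then searched (or, in the review setting, reviewed) independently, which is exactly what the sub-agents below do.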

Domain-Adaptive Expansion Examples

Mathematical code:

  • “Verify the mathematical correctness and numerical stability”
  • “Find edge cases where floating point precision causes incorrect results”
  • “Check that the algorithm matches the paper/specification it implements”

User-facing feature:

  • “Find failure modes a real user would trigger”
  • “Check error messages are actionable, not generic”
  • “Verify the feature degrades gracefully under load”

Payment integration:

  • “Verify idempotency of all payment operations”
  • “Check for race conditions in concurrent payment flows”
  • “Audit that webhook verification prevents replay attacks”

Data pipeline:

  • “Find scenarios where partial failures leave data in an inconsistent state”
  • “Check that backpressure is handled when downstream is slow”
  • “Verify exactly-once semantics or document at-least-once guarantees”

The point: no static checklist would generate all of these. The expansion agent tailors the queries to what matters.
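One way to operationalize these examples is as few-shot guidance appended to the expansion prompt, so the model sees what "domain-specific" looks like. A minimal sketch, where the domain labels and the helper are illustrative scaffolding rather than SDK features:

```typescript
// Hypothetical few-shot examples for the expansion prompt. The domains and
// queries mirror the examples above; this is prompt scaffolding, not an API.
const FEW_SHOT_EXAMPLES: Record<string, string[]> = {
  "mathematical code": [
    "Verify the mathematical correctness and numerical stability",
    "Find edge cases where floating point precision causes incorrect results",
  ],
  "payment integration": [
    "Verify idempotency of all payment operations",
    "Audit that webhook verification prevents replay attacks",
  ],
};

function fewShotBlock(): string {
  // Render each domain with its example queries as a bulleted section.
  return Object.entries(FEW_SHOT_EXAMPLES)
    .map(([domain, qs]) => `${domain}:\n${qs.map((q) => `- ${q}`).join("\n")}`)
    .join("\n\n");
}
```

Appending `fewShotBlock()` to the expansion prompt nudges the model away from generic checks without hardcoding the queries themselves.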

Implementation with the Agent SDK

Step 1: Define the Expansion Schema

Use structured output to get well-formed queries from the expansion step.

import { z } from "zod";

const ReviewQuery = z.object({
  id: z.string(),
  query: z.string().describe("The specific review question for the sub-agent"),
  rationale: z.string().describe("Why this check matters for this artifact"),
  tools: z.array(z.string()).describe("Tools the reviewer needs"),
});

const ExpandedQueries = z.object({
  artifact_summary: z.string(),
  queries: z.array(ReviewQuery).min(2).max(6),
});

type ExpandedQueries = z.infer<typeof ExpandedQueries>;

Step 2: Run the Expansion Agent

A cheap, fast call that reads the code and generates the review queries.

import { query } from "@anthropic-ai/claude-agent-sdk";

async function expandReviewQueries(filePath: string): Promise<ExpandedQueries> {
  const schema = z.toJSONSchema(ExpandedQueries);

  for await (const message of query({
    prompt: `Read ${filePath} and generate 3-5 review queries tailored to this specific code.
Each query should target a different dimension of quality that is relevant to what the code actually does.
Do not generate generic checks. Every query must be specific to the domain and patterns in this file.`,
    options: {
      allowedTools: ["Read", "Glob", "Grep"],
      model: "haiku",  // Expansion is cheap
      maxTurns: 5,
      outputFormat: { type: "json_schema", schema },
    },
  })) {
    if (message.type === "result" && message.structured_output) {
      return ExpandedQueries.parse(message.structured_output);
    }
  }

  throw new Error("Expansion agent failed to produce structured output");
}
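If the expansion step throws (as the function above does when no structured output arrives), the review can still proceed on a small static query set. A hedged sketch of that wrapper; the helper, its name, and the fallback queries are illustrative, not SDK features:

```typescript
// Sketch: run an expansion function, degrading to a static query set on
// failure. Generic over the result type so it composes with any expansion.
async function withFallback<T>(
  expand: () => Promise<T>,
  fallback: T,
): Promise<{ result: T; usedFallback: boolean }> {
  try {
    return { result: await expand(), usedFallback: false };
  } catch {
    // Expansion failed; fall back to static queries rather than aborting.
    return { result: fallback, usedFallback: true };
  }
}

// Illustrative always-applicable fallback queries.
const STATIC_QUERIES = [
  { id: "correctness", query: "Check core logic for correctness", rationale: "Applies to any code", tools: ["Read"] },
  { id: "errors", query: "Audit error handling paths", rationale: "Applies to any code", tools: ["Read", "Grep"] },
];
```

In the pipeline above this would wrap `expandReviewQueries(filePath)`, with the static list shaped like `ExpandedQueries.queries`.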

Step 3: Route Each Query to a Sub-Agent

The orchestrator creates sub-agents dynamically from the expanded queries. Each sub-agent gets a focused mission.

import { query, type AgentDefinition } from "@anthropic-ai/claude-agent-sdk";

async function runAdaptiveReview(filePath: string): Promise<string> {
  // Step 1: Expand
  const expansion = await expandReviewQueries(filePath);

  // Step 2: Build sub-agents from expanded queries
  const agents: Record<string, AgentDefinition> = {};
  for (const rq of expansion.queries) {
    agents[`reviewer-${rq.id}`] = {
      description: rq.query,
      prompt: `You are reviewing code in ${filePath}.

Your specific review mission: ${rq.query}

Why this matters: ${rq.rationale}

Instructions:
- Read the file and analyze it through the lens of your specific mission
- Report concrete issues with file paths and line numbers
- Rate severity: critical, warning, or minor
- If you find no issues for your specific angle, say so explicitly`,
      tools: rq.tools.length > 0 ? rq.tools : ["Read", "Grep", "Glob"],
      model: "sonnet",
    };
  }

  // Step 3: Orchestrate. The parent agent dispatches to sub-agents and aggregates.
  for await (const message of query({
    prompt: `Review ${filePath} using all available reviewer agents in parallel.
Run every reviewer-* agent, then aggregate their findings.
De-duplicate overlapping issues. Prioritize by severity.
Return a single consolidated review.`,
    options: {
      allowedTools: ["Read", "Grep", "Glob", "Task"],
      agents,
      model: "sonnet",
      maxTurns: 30,
    },
  })) {
    if (message.type === "result" && message.subtype === "success") {
      return message.result;
    }
  }

  throw new Error("Review orchestration failed");
}

Full Pipeline

// Usage
const review = await runAdaptiveReview("src/payments/charge.ts");
console.log(review);

// What happens under the hood:
// 1. Haiku reads charge.ts, generates:
//    - "Verify idempotency of the charge operation"
//    - "Check for race conditions between auth and capture"
//    - "Audit error handling for Stripe API failures"
//    - "Find scenarios where partial failures leave inconsistent state"
//
// 2. Four Sonnet sub-agents run in parallel, each focused on one query
//
// 3. Orchestrator de-dupes and returns consolidated findings

Cost Model

Step               Model   Calls  Est. Cost
Expansion          Haiku   1      $0.01-0.03
Review sub-agents  Sonnet  3-5    $0.10-0.40 each
Aggregation        Sonnet  1      $0.05-0.15
Total                      5-7    $0.36-2.18

Comparable to a 3-round actor-critic loop, but with higher coverage per dollar because every query is relevant.
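As a sanity check, the cost envelope implied by the per-step estimates can be recomputed directly. Note that taking the worst case of every row (five sub-agents, each at the top of the range) gives a higher upper bound than a typical run:

```typescript
// Sketch: recompute the cost envelope from the table's per-step estimates.
// All numbers are rough estimates from the table, not measured values.
type Range = { lo: number; hi: number };

function totalCost(opts: {
  agents: Range;      // how many review sub-agents run
  perAgent: Range;    // Sonnet cost per sub-agent
  expansion: Range;   // Haiku expansion call
  aggregation: Range; // final aggregation call
}): Range {
  return {
    lo: opts.expansion.lo + opts.agents.lo * opts.perAgent.lo + opts.aggregation.lo,
    hi: opts.expansion.hi + opts.agents.hi * opts.perAgent.hi + opts.aggregation.hi,
  };
}

const total = totalCost({
  agents: { lo: 3, hi: 5 },
  perAgent: { lo: 0.1, hi: 0.4 },
  expansion: { lo: 0.01, hi: 0.03 },
  aggregation: { lo: 0.05, hi: 0.15 },
});
// total.lo ≈ 0.36, total.hi ≈ 2.18
```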

When to Use This vs. Static Checklists

Scenario                        Recommendation
Known domain, stable codebase   Static checklists. You already know what to check.
Diverse codebase, many domains  Adaptive expansion. Review dimensions vary per file.
Security-critical code          Both. Adaptive expansion plus a mandatory security checklist.
CI pipeline (cost-sensitive)    Static checklists. Expansion adds latency and cost.
Pre-merge deep review           Adaptive expansion. Worth the cost for high-stakes merges.

The Relationship to Existing Patterns

This is not a replacement for actor-critic or swarm patterns. It is a preamble to them. You can chain expansion into any existing review workflow:

  • Expansion then actor-critic: Expansion generates the critique dimensions. The critic uses those dimensions instead of hardcoded ones.
  • Expansion then swarm: Expansion generates N queries. Each becomes a swarm agent.
  • Expansion then pipeline: Expansion generates ordered review stages. Each stage in the pipeline addresses one expanded query.

The expansion step is composable because it only produces queries. It does not execute them.
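Because the output is just a list of queries, wiring it into another pattern is mostly a string-building exercise. A sketch of the "expansion then actor-critic" variant, where the helper and the prompt wording are illustrative rather than SDK-defined:

```typescript
// Sketch: turn hypothetical expanded queries into the critique dimensions of
// an actor-critic prompt. The critic then reviews along these dimensions
// instead of a hardcoded checklist.
function buildCriticPrompt(queries: { query: string; rationale: string }[]): string {
  const dims = queries
    .map((q, i) => `${i + 1}. ${q.query} (why: ${q.rationale})`)
    .join("\n");
  return `Critique the artifact along these dimensions:\n${dims}\nReport issues per dimension, with severity.`;
}
```

The swarm and pipeline variants are the same idea: iterate over the queries and emit one agent definition or one stage per query.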
