The Actor-Critic Pattern: Writer + Reviewer Agents

James Phoenix

Summary

The actor-critic pattern separates generation from evaluation. One agent (the actor/writer) produces output, while another (the critic/reviewer) evaluates it and sends it back for improvement. This separation tends to produce higher-quality results than single-pass generation. The pattern applies to code, documentation, design docs, and any content where iterative refinement adds value.

The Core Pattern

┌────────────────────┐
│  ACTOR (Writer)    │
│  - Generates       │
│  - Creates quickly │
│  - Optimistic      │
└────────┬───────────┘
         │
         ▼
    [Draft output]
         │
         ▼
┌────────────────────┐
│  CRITIC (Reviewer) │
│  - Evaluates       │
│  - Finds issues    │
│  - Skeptical       │
└────────┬───────────┘
         │
         ├── Issues? → Actor revises → Critic re-reviews
         │
         └── No issues → Done

The actor focuses on completing the task. The critic focuses on finding problems. Neither role is “more important.” Together they produce better results than either could alone.

Why Separation Works

Single-pass generation has a fundamental conflict: the same mind that creates content cannot objectively evaluate it. The author sees what they intended to write, not what they actually wrote.

Separating roles fixes this:

  1. The actor writes freely without self-censoring or over-engineering
  2. The critic reviews objectively without ownership bias
  3. Multiple passes catch issues that fresh eyes reveal

This mirrors effective human workflows. Writers benefit from editors. Developers benefit from code review. The pattern encodes this wisdom into AI systems.

Two Implementation Approaches

Approach 1: Single Agent, Alternating Roles

The same LLM switches between actor and critic personas:

// Round 1: Write
const draft = await llm.generate(`
You are a technical writer creating a PRD.
Write a PRD for: ${feature}
`);

// Round 2: Review
const critique = await llm.generate(`
You are a senior product manager reviewing a PRD.
Find gaps, unclear requirements, and missing edge cases.

PRD to review:
${draft}
`);

// Round 3: Revise
const revised = await llm.generate(`
You are improving a PRD based on feedback.

Feature:
${feature}

Original PRD:
${draft}

Feedback:
${critique}

Revise to address all issues.
`);

Advantages: Simpler, cheaper (one model), maintains context between rounds.

Disadvantages: Same model biases persist, limited adversarial tension.

Approach 2: Separate Agents

Distinct agent instances with specialized configurations:

const writer = new Agent({
  role: 'technical-writer',
  persona: 'Write clear, concise documentation. Follow style guides.',
  tools: ['Read', 'Glob', 'Write'],
});

const reviewer = new Agent({
  role: 'documentation-reviewer',
  persona: `Review for clarity, accuracy, and completeness.
    Be critical. Find every gap and inconsistency.
    Only approve if genuinely ready for publication.`,
  tools: ['Read', 'Grep'],  // Read-only
});

// Execute loop
let doc = await writer.create(task);
let approved = false;
let rounds = 0;
const maxRounds = 5;  // Hard cap so a strict critic cannot loop forever

while (!approved && rounds < maxRounds) {
  const review = await reviewer.evaluate(doc);

  if (review.verdict === 'APPROVED') {
    approved = true;
  } else {
    doc = await writer.revise(doc, review.feedback);
  }
  rounds++;
}

Advantages: True separation of concerns, can use different models (cheaper critic), stronger adversarial tension.

Disadvantages: More complex setup, context not shared automatically.
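
The loop above reads `review.verdict` and `review.feedback`, but how the reviewer produces them is not shown. One possible sketch, assuming the critic is prompted to reply with a JSON object of that shape: the `Review` type and `parseReview` helper are hypothetical names, and the JSON contract is an assumption rather than part of any agent framework. The parser fails closed, so a malformed reply is never mistaken for approval.

```typescript
// Hypothetical review shape. The field names mirror the loop above,
// but the JSON contract itself is an assumption, not a documented API.
type Verdict = 'APPROVED' | 'NEEDS_REVISION';

interface Review {
  verdict: Verdict;
  feedback: string;
}

// Parse the critic's raw reply. Anything malformed fails closed:
// it is treated as a request for revision, never as an approval.
function parseReview(raw: string): Review {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data === 'object' && data !== null) {
      const record = data as Record<string, unknown>;
      const feedback = typeof record.feedback === 'string' ? record.feedback : '';
      if (record.verdict === 'APPROVED') {
        return { verdict: 'APPROVED', feedback };
      }
      return { verdict: 'NEEDS_REVISION', feedback };
    }
  } catch {
    // Not valid JSON: fall through to the fail-closed default below.
  }
  return { verdict: 'NEEDS_REVISION', feedback: raw };
}
```

Passing the raw reply back as feedback when parsing fails gives the writer something to act on instead of silently dropping the critic's text.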

When to Use Each Approach

| Scenario | Recommended Approach |
| --- | --- |
| Quick drafts, low stakes | Single agent |
| Security-critical code | Separate agents |
| Documentation, PRDs | Either works |
| High-visibility content | Separate agents |
| Tight budget | Single agent |
| Need audit trail | Separate agents |

Application Beyond Code

The actor-critic pattern is not limited to code review. It applies wherever quality matters:

Documentation

Actor (Writer): Creates technical documentation following style guides.

Critic (Editor): Checks for clarity, accuracy, completeness, and jargon.

## Critique Dimensions for Documentation

1. **Clarity**: Can a newcomer understand this?
2. **Accuracy**: Do code examples work? Are facts correct?
3. **Completeness**: Are edge cases covered? Missing sections?
4. **Consistency**: Does terminology match the codebase?
5. **Actionability**: Can readers follow the instructions?
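
The dimensions above can be turned into a critic prompt mechanically. This is a minimal sketch, not the author's implementation: `Dimension`, `docDimensions`, and `buildCritiquePrompt` are hypothetical names, and the instruction wording in the prompt is an assumption.

```typescript
// One critique dimension: a short name plus the question the critic answers.
interface Dimension {
  name: string;
  question: string;
}

// The five documentation dimensions listed above.
const docDimensions: Dimension[] = [
  { name: 'Clarity', question: 'Can a newcomer understand this?' },
  { name: 'Accuracy', question: 'Do code examples work? Are facts correct?' },
  { name: 'Completeness', question: 'Are edge cases covered? Missing sections?' },
  { name: 'Consistency', question: 'Does terminology match the codebase?' },
  { name: 'Actionability', question: 'Can readers follow the instructions?' },
];

// Build a critic prompt that walks through every dimension in order.
// The instruction sentences are illustrative wording, not a tested prompt.
function buildCritiquePrompt(dimensions: Dimension[], draft: string): string {
  const checklist = dimensions
    .map((d, i) => `${i + 1}. ${d.name}: ${d.question}`)
    .join('\n');
  return [
    'Review the draft below against each dimension.',
    'List concrete issues per dimension, or write "none".',
    '',
    checklist,
    '',
    'Draft:',
    draft,
  ].join('\n');
}
```

Keeping the dimensions as data means the same builder can take a different list for PRDs or book chapters without changing the prompt logic.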

PRDs and Design Docs

Actor (Author): Writes requirements and design proposals.

Critic (Stakeholder): Challenges assumptions, finds gaps, questions feasibility.

## Critique Dimensions for PRDs

1. **Requirements completeness**: All user needs captured?
2. **Edge cases**: What happens when things go wrong?
3. **Dependencies**: Are external dependencies identified?
4. **Measurability**: How will success be measured?
5. **Feasibility**: Can engineering build this as specified?

Book Chapters

The RALPH loop uses actor-critic for book writing:

Actor (Chapter Writer): Writes first draft following PRD and sources.

Critic (Reviewer Agents): Multiple specialized reviewers check different dimensions.

## Review Agents for Book Chapters

- **slop-checker**: Finds AI-text tells (delve, crucial, moreover)
- **tech-accuracy**: Validates code examples and tool references
- **term-intro-checker**: Ensures acronyms are defined
- **oreilly-style**: Applies publishing conventions
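
Running several specialized reviewers over one draft can be sketched as below. This is an illustration under assumptions, not the RALPH implementation: `Issue`, `Reviewer`, `findSlopWords`, and `runReviewers` are hypothetical names, and the slop-checker here is a plain word matcher standing in for what would normally be an LLM agent. It only knows the three example words from the list above.

```typescript
interface Issue {
  reviewer: string;
  message: string;
}

// A reviewer takes the chapter text and reports issues. Real reviewers
// would be LLM agents; the async signature leaves room for that.
type Reviewer = (text: string) => Promise<Issue[]>;

// Only the three example words named in the list above.
const SLOP_WORDS = ['delve', 'crucial', 'moreover'];

// Return the slop words that appear as whole words, case-insensitively.
function findSlopWords(text: string): string[] {
  const lower = text.toLowerCase();
  return SLOP_WORDS.filter((word) => new RegExp(`\\b${word}\\b`).test(lower));
}

// Deterministic stand-in for the slop-checker agent.
const slopChecker: Reviewer = async (text) =>
  findSlopWords(text).map((word) => ({
    reviewer: 'slop-checker',
    message: `Avoid "${word}"`,
  }));

// Run every reviewer on the same draft concurrently and merge the findings.
async function runReviewers(reviewers: Reviewer[], text: string): Promise<Issue[]> {
  const results = await Promise.all(reviewers.map((review) => review(text)));
  return results.reduce<Issue[]>((all, issues) => all.concat(issues), []);
}
```

Because each reviewer only reads the draft, they can run in parallel; the writer then receives one merged issue list tagged by reviewer.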

Stopping Criteria

The loop needs exit conditions:

  1. Approval: Critic finds no significant issues
  2. Max rounds: Typically 3-5 rounds (diminishing returns)
  3. Improvement stall: Less than 10% issue reduction between rounds
  4. Escalation: Hand off to human when stuck

function shouldStop(round: number, issues: number, prevIssues: number): boolean {
  if (issues === 0) return true;  // Approved
  if (round >= 5) return true;    // Max rounds
  if (prevIssues <= 0) return false;  // No baseline yet, so a stall cannot be measured

  const improvement = (prevIssues - issues) / prevIssues;
  if (improvement < 0.1) return true;  // Stalled: under 10% fewer issues

  return false;
}
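
The function above returns a bare boolean, so the caller cannot tell an approval from a stall, and the fourth exit condition (escalation) has nowhere to hook in. One possible variant reports why the loop stopped. `StopReason`, `stopReason`, and `needsHuman` are hypothetical names, and treating a `prevIssues` of 0 as "no baseline yet, keep going" is an assumption of this sketch.

```typescript
type StopReason = 'approved' | 'max-rounds' | 'stalled';

// Same thresholds as the list above: 5 rounds max, 10% minimum improvement.
// Returns null while the loop should keep iterating.
function stopReason(round: number, issues: number, prevIssues: number): StopReason | null {
  if (issues === 0) return 'approved';
  if (round >= 5) return 'max-rounds';
  // Assumption: prevIssues of 0 means there is no earlier review to compare with.
  if (prevIssues > 0 && (prevIssues - issues) / prevIssues < 0.1) return 'stalled';
  return null;
}

// Stopping with issues still open is the case for a human to pick up.
function needsHuman(reason: StopReason | null): boolean {
  return reason === 'max-rounds' || reason === 'stalled';
}
```

With this shape, an approved draft ships, while a stalled or round-capped draft is routed to a person together with the remaining issues.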

Cost vs Quality Trade-offs

More rounds cost more but catch more issues:

| Rounds | Typical Cost | Issues Caught |
| --- | --- | --- |
| 1 | $0.05-0.15 | 0% (no review) |
| 2 | $0.10-0.30 | 60-70% |
| 3 | $0.15-0.45 | 80-85% |
| 4 | $0.20-0.60 | 90-95% |
| 5 | $0.25-0.75 | 95%+ |

For high-stakes content, 3-5 rounds are worth the cost. For drafts and internal docs, 1-2 rounds suffice.
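
The cost column above grows by the same $0.05-0.15 with each round, so a rough budget check is simple arithmetic. This sketch assumes that linear relationship holds, which is read off the table rather than stated by it, and works in integer cents to avoid floating-point rounding. `estimateCostCents` and `maxRoundsForBudget` are hypothetical helper names.

```typescript
// Per-round cost range in cents, taken from the first row of the table.
const MIN_CENTS_PER_ROUND = 5;
const MAX_CENTS_PER_ROUND = 15;

// Estimated total cost range for a given number of rounds.
function estimateCostCents(rounds: number): { min: number; max: number } {
  return {
    min: rounds * MIN_CENTS_PER_ROUND,
    max: rounds * MAX_CENTS_PER_ROUND,
  };
}

// Largest number of rounds whose worst-case cost still fits the budget.
function maxRoundsForBudget(budgetCents: number): number {
  return Math.floor(budgetCents / MAX_CENTS_PER_ROUND);
}
```

For example, a 50-cent budget covers three rounds at the worst-case rate, which lands in the 80-85% row of the table.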

Common Pitfalls

1. Critic Too Harsh

Problem: Critic never approves, creates infinite loops.

Fix: Set maximum rounds. Accept “good enough.” Have critic distinguish critical vs minor issues.
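
One way to encode "good enough" is to have the critic label each issue with a severity and approve once nothing critical remains. A minimal sketch, assuming the critic can be prompted to emit such labels: `Severity`, `ReviewIssue`, and `isGoodEnough` are hypothetical names, and the default tolerance of three minor issues is an arbitrary illustrative threshold.

```typescript
type Severity = 'critical' | 'minor';

interface ReviewIssue {
  severity: Severity;
  description: string;
}

// Approve when no critical issues remain and the minor ones are few.
// maxMinor is a hypothetical knob; 3 is an arbitrary default.
function isGoodEnough(issues: ReviewIssue[], maxMinor: number = 3): boolean {
  const critical = issues.filter((issue) => issue.severity === 'critical').length;
  const minor = issues.length - critical;
  return critical === 0 && minor <= maxMinor;
}
```

A harsh critic can then keep reporting nitpicks without blocking the loop, since only critical findings force another round.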

2. Critic Too Lenient

Problem: Critic approves everything, provides no value.

Fix: Use adversarial prompting. Require specific issue counts. Set quality thresholds.

3. Lost Context Between Rounds

Problem: Actor forgets original requirements when revising.

Fix: Include original task in every revision prompt. Use session-based agents.
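
Including the original task in every revision prompt can be enforced by building the prompt in one place. This is a sketch under assumptions: `buildRevisionPrompt` is a hypothetical helper and the instruction wording is illustrative.

```typescript
// Build a revision prompt that always carries the original task,
// so the actor cannot drift away from the requirements while revising.
function buildRevisionPrompt(task: string, draft: string, feedback: string): string {
  return [
    'You are revising a draft based on reviewer feedback.',
    '',
    'Original task (the revision must still satisfy this):',
    task,
    '',
    'Current draft:',
    draft,
    '',
    'Reviewer feedback:',
    feedback,
    '',
    'Address every issue in the feedback without dropping any original requirement.',
  ].join('\n');
}
```

Because the task is a required parameter, a revision call that omits it fails at compile time instead of silently losing context.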

4. Over-Engineering

Problem: Each round adds complexity without improving quality.

Fix: Critic should enforce simplicity. “Only fix real issues, don’t add features.”

Integration with RALPH

The RALPH loop uses actor-critic at multiple levels:

  1. Micro-level: Within each task, code/content is generated then reviewed
  2. Macro-level: Every 6 iterations, review agents run across all output
  3. Meta-level: Progress summarizer evaluates overall quality metrics

This multi-scale application of actor-critic creates compounding quality.
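
The macro-level cadence can be expressed as a simple counter check. This is a sketch, not RALPH's actual scheduling code: `isMacroReviewDue` is a hypothetical name, and a 1-based iteration counter is assumed.

```typescript
// Interval from the list above: review agents run every 6 iterations.
const MACRO_REVIEW_INTERVAL = 6;

// True on iterations 6, 12, 18, ... (assumes iterations are counted from 1).
function isMacroReviewDue(iteration: number): boolean {
  return iteration > 0 && iteration % MACRO_REVIEW_INTERVAL === 0;
}
```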
