Workflow & practice

Self-critique

Also called: adversarial review, grilling

Self-critique is asking a model to attack its own output, or having a fresh agent do it, to catch bugs and bad assumptions before you ship. It is a direct counter to a model's tendency to agree with whatever it just produced.

James Phoenix
Understanding Data Updated July 2, 2026

Self-critique is telling a model to turn on its own work: find the bugs, name the assumptions, argue why the approach is wrong. You can prompt the same session to critique what it just wrote, or spin up a fresh subagent with no attachment to the answer and tell it to attack.

Why it works

Generating and judging are different tasks, and a model is often better at spotting a flaw than at avoiding it in the first place. A pass that only asks "what is wrong with this" regularly surfaces real problems the first draft missed. It is also a direct counter to sycophancy: left alone, a model will bless its own output, so you have to explicitly instruct it to find fault.

The framing decides the result. "Does this look good?" invites a yes. "You are a hostile reviewer, give me three ways this breaks" gets you something useful. A clean agent grills harder than the one that wrote the code, because it is not defending its own draft.

  • Run it before you ship, on plans as much as on code.
  • Ask for specific failure modes, not a general verdict.
  • Prefer a fresh agent so it has no ego in the answer.
Tip
Self-critique is cheap and catches a lot of the obvious, but it does not replace automated review or human review. Treat it as the first filter that clears easy mistakes so the later gates can spend their effort on the hard ones.

Related terms

Building with AI agents?

This dictionary is part of how I think about agentic engineering. If you want the same thinking applied to your codebase, that is what I do.

See how I can help