Self-critique is telling a model to turn on its own work: find the bugs, name the assumptions, argue why the approach is wrong. You can prompt the same session to critique what it just wrote, or spin up a fresh subagent with no attachment to the answer and tell it to attack.
Why it works
Generating and judging are different tasks, and a model is often better at spotting a flaw than at avoiding it in the first place. A pass that only asks "what is wrong with this" regularly surfaces real problems the first draft missed. It is also a direct counter to sycophancy: left alone, a model will bless its own output, so you have to explicitly instruct it to find fault.
The framing decides the result. "Does this look good?" invites a yes. "You are a hostile reviewer, give me three ways this breaks" gets you something useful. A clean agent grills harder than the one that wrote the code, because it is not defending its own draft.
- Run it before you ship, on plans as much as on code.
- Ask for specific failure modes, not a general verdict.
- Prefer a fresh agent so it has no ego in the answer.
Related terms
Sycophancy
Sycophancy is a model's tendency to agree with you and tell you what you want to hear rather than push back. It is why "am I right?" is a leading question that produces a leading answer.
Read definition →Automated review
Automated review is putting a change through an AI reviewer before a person sees it, so a second agent flags likely bugs, missed edge cases, and smells. It catches the obvious cheaply, but it does not replace human judgment about whether the change is right.
Read definition →Human review
Human review is a person actually reading what an agent produced, understanding it, and taking responsibility for shipping it. It is the final quality gate that tests and automated review can support but never replace.
Read definition →