Hallucination

Also called: confabulation

A hallucination is a confident, plausible-sounding output that is simply wrong: an invented API, a fabricated file path, a made-up citation. It is not the model lying. It is the model doing exactly what it always does, predicting plausible text, with no built-in sense of truth.

James Phoenix
Understanding Data Updated July 2, 2026

A hallucination is when a model produces something fluent and confident that turns out to be false. In coding this shows up constantly: a method that does not exist in the library, an import from the wrong package, a config option someone wishes were real, a plausible-looking function signature that is subtly invented.
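To make that concrete, here is a small Python sketch of an invented method. The example is mine, not taken from a real model transcript. `str.contains` reads naturally because similar names exist elsewhere (Java's `String.contains`, pandas' `Series.str.contains`), but Python's built-in `str` has no such method.

```python
text = "hallucination"

# Plausible-looking but invented: Python's built-in str has no `contains`
# method, even though the name exists in other languages and libraries.
try:
    text.contains("luc")
    invented_call_worked = True
except AttributeError:
    invented_call_worked = False

# The real spelling is the `in` operator.
real_result = "luc" in text
```

Nothing about the invented call looks wrong on the page. It only fails once you run it, which is the whole problem.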

Why it happens

It follows directly from what a model is. The model predicts the most plausible next tokens; on its own it does not look anything up, and it has no internal "is this true" check. Most of the time plausible and correct line up, so it feels reliable. When they diverge, you get an answer that reads perfectly and is wrong. Crucially, the model has no reliable way to tell the difference, which is why hallucinations come with the same confident tone as correct answers.
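A toy sketch can show the shape of the problem. This is not how a real language model works internally (real models are neural networks over tokens, not word counts), and the training sentences are made up for illustration. It only shows that "pick the most likely continuation" contains no step that checks truth.

```python
from collections import Counter, defaultdict

def train(sentences):
    """Count which word follows each word in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            follows[current][following] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent continuation of `word`, or None if unseen.

    Note what is missing: nothing here checks whether the result is true.
    """
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

# Hypothetical training text: lists are mentioned far more often than sets.
sentences = [
    "to add to a list call append",
    "to add to a list call append",
    "to add to a list call append",
    "to add to a set call add",
]
model = train(sentences)

# Asked to complete "... to a set call", this toy model only sees "call"
# and returns the most common continuation, which is wrong for sets.
prediction = predict_next(model, "call")
```

Here `prediction` is `"append"`: the most plausible continuation given the counts, delivered with no signal that it is the wrong answer for a set.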

What actually reduces it

You cannot make a model stop hallucinating, but you can starve the failure mode:

  • Give it ground truth. The single biggest lever. Let the agent read the real file, the real types, the actual docs, so the answer comes from your context instead of the model's memory.
  • Prefer tools over recall. An agent that runs the code or greps the repo beats one guessing from training.
  • Verify, do not trust. Run it, type-check it, click it. Treat every generated API you have not seen before as unverified until you have confirmed it exists.
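As one cheap form of verification, you can check that a generated name actually resolves before you build on it. The sketch below is a minimal example using only Python's standard library; the helper name `api_exists` is hypothetical. It assumes a well-formed absolute dotted name and a trusted, installed package, because importing a module runs its top-level code.

```python
import importlib

def api_exists(dotted_name: str) -> bool:
    """Return True if `dotted_name` (e.g. "json.dumps") resolves to a real object."""
    parts = dotted_name.split(".")
    # Try the longest importable module prefix first, then walk the
    # remaining parts as attributes on that module.
    for i in range(len(parts), 0, -1):
        module_name = ".".join(parts[:i])
        try:
            obj = importlib.import_module(module_name)
        except (ImportError, ValueError):
            continue
        for attr in parts[i:]:
            if not hasattr(obj, attr):
                return False
            obj = getattr(obj, attr)
        return True
    return False

real = api_exists("json.dumps")            # a real standard-library function
invented = api_exists("json.dump_pretty")  # plausible name, does not exist
```

This only confirms the name exists. It says nothing about the signature or the behaviour, so running the code and type-checking it still matter.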

Watch out

Hallucinations are most dangerous exactly where they are hardest to catch: obscure library methods, version-specific behaviour, and anything past the model's knowledge cutoff. If an agent cites something you cannot immediately confirm, assume it needs checking.
