Non-determinism

Also called: randomness

Non-determinism is why the same prompt can give you different answers. At inference the model samples among likely next tokens with a controlled amount of randomness, so runs vary.

James Phoenix
Understanding Data · Updated July 2, 2026

Send a model the exact same prompt twice and you can get two different answers. That is non-determinism, and it is by design, not a bug. During next-token prediction the model produces a probability distribution over possible next tokens, and it usually samples from that distribution rather than always taking the single most likely token. A setting called temperature controls how much randomness gets mixed in: higher temperature gives more variety, lower temperature gives more predictable output.
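
To make the sampling step concrete, here is a minimal sketch of temperature sampling in plain Python. It is an illustration, not any provider's actual implementation: the function name `sample_with_temperature`, the three candidate tokens, and their scores are all made up for the example, and treating temperature 0 as "always take the top token" is a common convention that I am assuming here.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample a token index from raw scores (logits) after temperature scaling."""
    if temperature <= 0:
        # Assumed convention: zero temperature means greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [score / temperature for score in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    # choices() normalises the weights, so this is a softmax sample.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

# Hypothetical next-token candidates and scores, invented for illustration.
tokens = ["cat", "dog", "fish"]
logits = [2.0, 1.5, 0.2]

rng = random.Random()  # unseeded on purpose, so repeated runs differ
for temperature in (0.0, 0.7, 1.5):
    picks = [tokens[sample_with_temperature(logits, temperature, rng)] for _ in range(10)]
    print(temperature, picks)
```

Run it a few times. The temperature 0.0 row stays the same, while the higher-temperature rows change from run to run and drift further from the top-scoring token.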

Why providers do this

A little randomness makes output feel less robotic and helps the model escape repetitive ruts. The trade-off is reproducibility. Even at very low temperature you are not guaranteed identical results, because providers batch requests on parallel hardware, where tiny floating-point variations creep in and can occasionally tip a close call between two tokens.
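
One plausible source of those tiny variations is that floating-point addition is not associative, so adding the same numbers in a different order can give a slightly different result. The toy below shows only that effect. It is a sketch under that assumption, not a model of any real inference stack, and the `pick` helper and the `rival` score are invented for the example.

```python
# The same three numbers, added in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)  # False
print(left_to_right, right_to_left)

def pick(score_a, score_b):
    """Hypothetical tie-break: choose token A only if it scores strictly higher."""
    return "A" if score_a > score_b else "B"

# A rival token whose score sits right on the boundary (invented value).
rival = 0.6
print(pick(left_to_right, rival))  # A
print(pick(right_to_left, rival))  # B
```

The difference is in the last decimal place, but when two candidates are nearly tied that is enough to change which one wins, and every later token then builds on a different prefix.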

What it means for coding work

This is easy to forget until it bites you:

  • A passing run is not proof. An agent solving a task once does not mean it will solve it every time. If reliability matters, run it more than once.
  • Do not hard-code exact wording. Tests or scripts that assume the model returns a specific string will be flaky. Assert on behaviour or structure instead.
  • Bugs can be intermittent. A prompt that fails one time in five is still broken. Chase the pattern, not the single lucky success.
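
The three points above can be sketched as a small harness that runs the same prompt many times, checks structure rather than wording, and reports a pass rate. Everything here is hypothetical: `fake_model` is a stand-in for a real model call, its 20% failure rate is invented, and the JSON shape with `status` and `files_changed` is just an example contract.

```python
import json
import random

def fake_model(prompt, rng):
    """Hypothetical stand-in for a model call: wording varies, and it sometimes fails."""
    greeting = rng.choice(["Sure!", "Here you go.", "Done."])
    if rng.random() < 0.2:
        # Simulate an intermittent failure: a reply with no JSON in it.
        return greeting + " I could not complete the task."
    payload = {"status": "ok", "files_changed": rng.randint(1, 3)}
    return greeting + " " + json.dumps(payload)

def check_structure(reply):
    """Assert on structure, not exact wording: look for a JSON object with the expected keys."""
    start = reply.find("{")
    if start == -1:
        return False
    try:
        payload = json.loads(reply[start:])
    except json.JSONDecodeError:
        return False
    return payload.get("status") == "ok" and isinstance(payload.get("files_changed"), int)

def pass_rate(prompt, runs, rng):
    """Run the same prompt several times and return the fraction of runs that pass."""
    passes = sum(check_structure(fake_model(prompt, rng)) for _ in range(runs))
    return passes / runs

rng = random.Random(42)  # seeded only so this demo is repeatable
rate = pass_rate("refactor the parser", 50, rng)
print(f"pass rate over 50 runs: {rate:.0%}")
```

A single green run tells you very little here. The pass rate over many runs is the number worth tracking, and a string-equality check would have failed on the greeting alone.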

Watch out

Non-determinism compounds in agents. A different early choice sends the whole run down a different path, so two attempts at the same task can diverge completely. Reproducing an agent failure often takes several tries.
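
A rough way to see the compounding is to ask how likely it is that two runs make the same choice at every step. The sketch below assumes each step matches independently with a fixed probability, which is a deliberate simplification, and the 0.95 figure is invented for illustration rather than measured.

```python
def chance_of_identical_run(per_step_match, steps):
    """Probability that every step matches, assuming steps are independent."""
    return per_step_match ** steps

# Hypothetical per-step match probability, chosen only for illustration.
per_step_match = 0.95
for steps in (1, 5, 20, 50):
    print(steps, round(chance_of_identical_run(per_step_match, steps), 3))
```

Even when each individual step is very likely to repeat, the chance of the whole run repeating falls quickly as the number of steps grows.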
