Attention budget

The attention budget is the idea that a model's effective attention is a finite resource spread across the whole context window. The more you put in, the thinner the attention on each piece.

James Phoenix
Understanding Data Updated July 2, 2026

An attention budget is a way of thinking about attention as a finite resource. The model's ability to focus is spread across everything in the context at once, so the more you put in, the thinner that focus gets on any single piece.

Finite focus

The metaphor is not literal accounting, but it captures something real. Filling a large context window does not give you a model that pays full attention to all of it. It gives you attention divided across more material. Load ten files when the task needs two, and the model's focus is now spread across eight files that do not help, at the expense of the two that do.
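To make the ten-files-versus-two arithmetic concrete, here is a minimal sketch. It assumes equally sized files and uses the share of relevant tokens as a crude proxy for how diluted the context is. It does not measure what the model actually attends to, and `TOKENS_PER_FILE` and `relevant_share` are hypothetical names made up for illustration.

```python
def relevant_share(relevant_tokens: int, total_tokens: int) -> float:
    """Fraction of the context that the current task actually needs."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return relevant_tokens / total_tokens


# Hypothetical figure: assume every file is about the same size.
TOKENS_PER_FILE = 2_000

# Tight context: only the two files the task needs.
tight = relevant_share(2 * TOKENS_PER_FILE, 2 * TOKENS_PER_FILE)

# Bloated context: the same two files plus eight that do not help.
bloated = relevant_share(2 * TOKENS_PER_FILE, 10 * TOKENS_PER_FILE)

print(f"tight: {tight:.0%}, bloated: {bloated:.0%}")  # tight: 100%, bloated: 20%
```

The number is only a rough signal, but it shows the shape of the problem: the relevant material is the same size in both cases, and only its share of the window changes.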

Where the budget gets wasted:

  • Files, logs, or docs that are not relevant to the current task.
  • Finished conversation history nobody needs anymore.
  • Long, repetitive tool output that could be trimmed to the point.
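The three sources of waste above can be sketched as a small pruning pass. This is an illustration under assumptions, not a prescribed implementation: `ContextItem`, its `relevant` and `finished` flags, and both helper functions are hypothetical, and deciding what counts as relevant is left to whoever assembles the context.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    name: str
    text: str
    relevant: bool          # hypothetical flag: does the current task need this?
    finished: bool = False  # hypothetical flag: history from work already done


def collapse_repeats(text: str) -> str:
    """Collapse runs of identical consecutive lines into one line plus a count."""
    out = []
    prev = None
    count = 0
    for line in text.splitlines():
        if line == prev:
            count += 1
            continue
        if prev is not None:
            out.append(prev if count == 1 else f"{prev}  (x{count})")
        prev, count = line, 1
    if prev is not None:
        out.append(prev if count == 1 else f"{prev}  (x{count})")
    return "\n".join(out)


def prune(items: list[ContextItem]) -> list[ContextItem]:
    """Drop irrelevant and finished items, then trim repetition in what remains."""
    return [
        ContextItem(i.name, collapse_repeats(i.text), i.relevant, i.finished)
        for i in items
        if i.relevant and not i.finished
    ]


items = [
    ContextItem("auth.py", "def login(): ...", relevant=True),
    ContextItem("old_build.log", "warning\nwarning\nwarning", relevant=False),
    ContextItem("earlier_chat", "resolved discussion", relevant=True, finished=True),
    ContextItem("test_output", "retrying\nretrying\nretrying\nFAILED test_login", relevant=True),
]

kept = prune(items)
print([i.name for i in kept])  # ['auth.py', 'test_output']
print(kept[1].text)
```

Collapsing repeated lines is only one way to trim tool output. Summarising, or keeping just the failing section, would serve the same goal.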

Spend it deliberately

This is why bigger is not automatically better. A model with plenty of room left in its window can still perform worse than one handed a tight, relevant slice, because a bloated context spreads the budget thin and pushes the model toward attention degradation.

Tip
Treat context space as attention you are spending, not just tokens you are paying for. Before adding something, ask whether it earns its place. Clearing finished work, pointing at the few files that matter, and keeping instructions crisp all protect the budget for the part you actually care about.
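One way to turn "does it earn its place" into a habit is to give every candidate a relevance score and admit items greedily under a fixed budget. The sketch below assumes you already have such scores. The scores, the `min_score` cutoff, and the four-characters-per-token estimate are all illustrative assumptions, and a real tokenizer would give better counts.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly 4 characters per token.
    return max(1, len(text) // 4)


def select_context(candidates, budget_tokens, min_score=0.5):
    """Pick the most relevant items that fit within the budget.

    candidates: list of (name, text, score) tuples, where score is a
    hypothetical relevance value between 0 and 1.
    Returns (chosen_names, tokens_used).
    """
    chosen = []
    used = 0
    for name, text, score in sorted(candidates, key=lambda c: c[2], reverse=True):
        if score < min_score:
            continue  # does not earn its place
        cost = estimate_tokens(text)
        if used + cost > budget_tokens:
            continue  # would blow the budget, so skip it
        chosen.append(name)
        used += cost
    return chosen, used


candidates = [
    ("handler.py", "x" * 400, 0.9),
    ("changelog.md", "x" * 400, 0.1),
    ("full_trace.log", "x" * 4000, 0.8),
    ("handler_test.py", "x" * 200, 0.7),
]

names, used = select_context(candidates, budget_tokens=200)
print(names, used)  # ['handler.py', 'handler_test.py'] 150
```

The budget here is deliberately smaller than the window. The point is to cap what you spend on purpose, not to fill whatever space is available.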
