Attention budget

An attention budget is a way of thinking about attention as a finite resource. The model's ability to focus is spread across everything in the context at once, so the more you put in, the thinner that focus gets on any single piece.

Finite focus

The metaphor is not literal accounting, but it captures something real. Cramming a giant context window full does not give you a model that pays full attention to all of it. It gives you attention divided across more material. Fill the window with ten files when the task needs two, and the model is now splitting its focus eight ways that do not help, at the expense of the two that do.

Where the budget gets wasted:

Files, logs, or docs that are not relevant to the current task.
Finished conversation history nobody needs anymore.
Long, repetitive tool output that could be trimmed to the point.

Spend it deliberately

This is why bigger is not automatically better. A model with plenty of room left in its window can still perform worse than one handed a tight, relevant slice, because a bloated context starves the budget and slides toward attention degradation.

Tip

Treat context space as attention you are spending, not just tokens you are paying for. Before adding something, ask whether it earns its place. Clearing finished work, pointing at the few files that matter, and keeping instructions crisp all protect the budget for the part you actually care about.

Finite focus

Spend it deliberately

Related terms

Attention

Context window

Attention degradation

Building with AI agents?