Lost in the middle

Give a model a long context and it does not attend to all of it evenly. It tends to use what sits at the beginning and the end well, and to overlook what is buried in the middle. The effect was documented in the 2023 paper "Lost in the Middle: How Language Models Use Long Contexts" (Liu et al.), and the name stuck because it describes exactly what you see in practice: the right answer is in the context, but the model reads past it.

Why it bites

It is the reason a bigger context window is not a free win. You can fit a hundred documents in, but if the one that matters lands in the middle of the pile, the model may never really see it. "Just put everything in the prompt" fails silently: the answer looks confident and can be simply wrong.

What to do about it

Retrieve, do not dump. Pull in the few passages that matter with RAG instead of stuffing the whole corpus in and hoping.
Mind the placement. If you must include a lot, put the most important material at the start or the end rather than in the middle.
Keep it short. A tighter context leaves less of a middle to get lost in. With less material to spread across, more of the model's attention lands on what counts.
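
The first two points can be combined in a few lines. Here is a minimal sketch, assuming your retriever already returns (score, passage) pairs; the function name select_and_place and the cutoff k are hypothetical, chosen only for illustration. It keeps the top k passages, then orders them so the strongest sit at the edges of the context and the weakest fall in the middle.

```python
from typing import List, Tuple


def select_and_place(scored: List[Tuple[float, str]], k: int = 4) -> List[str]:
    """Keep the k highest-scoring passages and push the best ones to the edges.

    `scored` is assumed to hold (relevance_score, passage_text) pairs from
    whatever retriever you use; higher scores mean more relevant.
    """
    # Retrieve, do not dump: keep only the k most relevant passages.
    top = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

    front: List[str] = []
    back: List[str] = []
    for rank, (_, text) in enumerate(top):
        # Alternate by rank: best goes to the front, second best to the back,
        # third to the front, and so on.
        (front if rank % 2 == 0 else back).append(text)

    # Reverse the back half so the second-best passage ends up last.
    return front + back[::-1]


passages = [(0.1, "E"), (0.9, "A"), (0.7, "C"), (0.8, "B"), (0.6, "D")]
ordered = select_and_place(passages, k=4)
print(ordered)
```

With these sample scores the best passage (A) opens the context, the second best (B) closes it, and the lowest-scoring passage is dropped entirely. Whether this ordering helps in your setup is something to measure, not assume.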

Watch out

This failure is worst exactly where it is hardest to catch: long documents, big retrieved sets, and sprawling chat histories. If a model ignores a fact you know is in the context, do not assume it cannot read. Assume the fact is buried, and move it.
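
One way to check whether a fact is being buried is to ask the same question with the fact placed at different depths and compare the answers. The sketch below only builds the prompts; the helper names place_fact and build_probes are hypothetical, and sending each prompt to a model is left to whatever client you already use.

```python
from typing import Dict, List, Sequence


def place_fact(filler: List[str], fact: str, depth: float) -> str:
    """Insert `fact` among `filler` passages at a relative depth.

    depth = 0.0 puts the fact first, 1.0 puts it last, 0.5 in the middle.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    passages = filler[:index] + [fact] + filler[index:]
    return "\n\n".join(passages)


def build_probes(
    filler: List[str],
    fact: str,
    question: str,
    depths: Sequence[float] = (0.0, 0.5, 1.0),
) -> Dict[float, str]:
    """Build one prompt per depth, each ending with the same question."""
    return {
        depth: place_fact(filler, fact, depth) + "\n\nQuestion: " + question
        for depth in depths
    }


probes = build_probes(
    filler=["Passage one.", "Passage two.", "Passage three.", "Passage four."],
    fact="The launch code is 4417.",
    question="What is the launch code?",
)
for depth, prompt in probes.items():
    print(depth, repr(prompt[:40]))
```

If the model answers correctly at depths 0.0 and 1.0 but misses at 0.5, that suggests the fact was buried rather than unreadable, and moving it is the fix to try first.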

Related terms

Context engineering

Retrieval-augmented generation (RAG)

Chunking
