Lost in the Middle: Preventing Context Window Attention Degradation

James Phoenix

Summary

Large language models exhibit a U-shaped attention pattern: they attend most strongly to information at the beginning and end of their context window, while information in the middle receives reduced attention. This “lost in the middle” effect causes relevant information to be missed even when present in context. Mitigation strategies include positioning critical content at extremes, using progressive disclosure to avoid middle-stuffing, chunking with summaries, and query-anchored context placement.

The Problem

You’ve carefully curated your context. The relevant code is included. The documentation is there. The examples are perfect. Yet the LLM ignores crucial information and generates incorrect output.

The culprit is often not what you included, but where you placed it.

The U-Shaped Attention Curve

Research from “Lost in the Middle: How Language Models Use Long Contexts” (Liu et al., 2023) documented a striking pattern:

```
Attention Strength
^
| ████                                ████
| ████                                ████
| ████                                ████
| ████ ▓▓▓▓                      ▓▓▓▓ ████
| ████ ▓▓▓▓ ░░░░░░░░░░░░░░░░░░░░ ▓▓▓▓ ████
| ████ ▓▓▓▓ ░░░░░░░░░░░░░░░░░░░░ ▓▓▓▓ ████
+-----------------------------------------> Position
  Start            Middle             End

████ = High attention (80-100%)
▓▓▓▓ = Medium attention (50-80%)
░░░░ = Low attention (20-50%)
```

Key findings:

  • Information at positions 1-10% and 90-100% is processed most accurately
  • Information at positions 40-60% (middle) shows significant accuracy degradation
  • Effect is more pronounced as context length increases
  • Models trained on long contexts still exhibit the pattern

Real-World Impact

Scenario: 20K token context with relevant code in middle

```
Context Structure:
├── System prompt (1K tokens) - Position: 0-5%
├── CLAUDE.md (2K tokens) - Position: 5-15%
├── Type definitions (3K tokens) - Position: 15-30%
├── Irrelevant utility files (5K tokens) - Position: 30-55%
├── THE RELEVANT CODE (2K tokens) - Position: 55-65% ← IN THE DANGER ZONE
├── More utilities (4K tokens) - Position: 65-85%
└── Test examples (3K tokens) - Position: 85-100%

Result: Model may miss the relevant code entirely!
```
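The position percentages above follow mechanically from the section sizes. A small sketch (the section names and token counts come from the scenario; the helper itself is hypothetical):

```typescript
interface Section {
  name: string;
  tokens: number;
}

// Compute where each ordered section starts and ends, as a
// fraction of total context length.
function positionRanges(
  sections: Section[]
): Array<{ name: string; start: number; end: number }> {
  const total = sections.reduce((sum, s) => sum + s.tokens, 0);
  let cursor = 0;
  return sections.map(s => {
    const start = cursor / total;
    cursor += s.tokens;
    return { name: s.name, start, end: cursor / total };
  });
}

// The 20K-token scenario above: the relevant code spans 55-65%.
const ranges = positionRanges([
  { name: 'system prompt', tokens: 1000 },
  { name: 'CLAUDE.md', tokens: 2000 },
  { name: 'type definitions', tokens: 3000 },
  { name: 'irrelevant utilities', tokens: 5000 },
  { name: 'relevant code', tokens: 2000 },
  { name: 'more utilities', tokens: 4000 },
  { name: 'test examples', tokens: 3000 }
]);
```

Running this audit on your own context templates shows immediately which sections land in the 40-60% danger zone.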

Symptoms of Lost-in-the-Middle

  1. Ignoring provided examples: You included exact patterns, but output doesn’t follow them
  2. Missing explicit constraints: Constraints were in context but violated
  3. Not using provided code: Relevant code included but model writes from scratch
  4. Selective blindness: Model acknowledges context exists but doesn’t use it
  5. Recency bias: Model heavily weights most recent context, ignores middle

Why This Happens

Attention Mechanism Limitations

Transformer attention has computational biases:

```
Attention(Q, K, V) = softmax(Q × K^T / √d_k) × V

Where:

  • Q (Query) = What we’re looking for
  • K (Key) = What’s available in context
  • V (Value) = The content itself
```

Positional encodings create implicit biases:

  • Beginning tokens: Strong signal from initial position embeddings
  • End tokens: Strong signal from recency in autoregressive attention
  • Middle tokens: Weaker relative position signals, competition from both ends

Long-Context Training Gaps

Even models trained on long contexts show the pattern because:

  1. Important information in training data is unevenly distributed by position
  2. Attention patterns are learned from typical document structures
  3. Most training documents front-load key information
  4. Autoregressive training creates recency effects

Mitigation Strategies

Strategy 1: Critical Content at Extremes

Place most important information at beginning and end:

```typescript
interface ContextStructure {
  beginning: ContextItem[]; // High attention zone (0-15%)
  middle: ContextItem[];    // Lower attention zone (15-85%)
  end: ContextItem[];       // High attention zone (85-100%)
}

function structureContext(items: ContextItem[]): ContextStructure {
  // Sort by importance, most important first
  const sorted = [...items].sort((a, b) => b.importance - a.importance);

  const total = sorted.length;
  const extremeCount = Math.ceil(total * 0.3); // 30% at extremes
  const half = Math.floor(extremeCount / 2);

  return {
    beginning: sorted.slice(0, half),
    middle: sorted.slice(half, total - half),
    end: sorted.slice(total - half)
  };
}
```

Application:

```markdown
BEGINNING: Critical constraints (high attention)

  • Return types must be Result<T, E>
  • Never throw exceptions
  • All functions must be pure

MIDDLE: Background context (lower attention OK)

  • Project history…
  • Architecture documentation…
  • Less critical utilities…

END: Task-specific requirements (high attention)

  • The specific function to implement
  • Required tests to pass
  • Current file context
```

Strategy 2: Query-Anchored Context

Place task-relevant context near the query (end of prompt):

```typescript
function buildQueryAnchoredContext(
  staticContext: string,
  relevantCode: string,
  query: string
): string {
  // Static context first (less critical positioning)
  // Relevant code right before query (high attention zone)
  return `
${staticContext}

Relevant Code (Read This Carefully)

${relevantCode}

Your Task

${query}
`;
}
```

Before (Lost in middle):

```
System prompt -> Types -> Utilities -> RELEVANT CODE -> More code -> Task
                                       ^^^^^^^^^^^^^
                                       (Position 50-60%, low attention)
```

After (Query-anchored):

```
System prompt -> Types -> Utilities -> More code -> RELEVANT CODE -> Task
                                                    ^^^^^^^^^^^^^
                                                    (Position 85-95%, high attention)
```

Strategy 3: Progressive Disclosure

Don’t stuff everything into context. Load on-demand:

```typescript
// BAD: Load all 50 files into the middle of the context
const context = allFiles.map(f => f.content).join('\n');

// GOOD: Load only relevant files, position strategically
async function buildProgressiveContext(task: Task): Promise<string> {
  const coreContext = await loadCoreContext();         // Always at beginning
  const relevantFiles = await findRelevantFiles(task); // Near end

  // Skip irrelevant files entirely - they would just fill the middle
  return `
${coreContext}

Files Relevant to This Task

${relevantFiles.map(f => f.content).join('\n')}

Task

${task.description}
`;
}
```

Strategy 4: Chunking with Summaries

Break long content into chunks with summary headers:

```typescript
function chunkWithSummaries(
  content: string,
  chunkSize: number = 500
): string {
  const chunks = splitIntoChunks(content, chunkSize);

  return chunks.map((chunk, i) => {
    const summary = generateSummary(chunk); // Brief summary
    return `
Chunk ${i + 1}: ${summary}

${chunk}
`;
  }).join('\n');
}
```

Why this helps:

  • Summaries at chunk boundaries act as attention anchors
  • Model can scan summaries even if chunk content is in low-attention zone
  • Reduces effective “middle” by creating mini-beginnings throughout
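The `chunkWithSummaries` sketch above assumes `splitIntoChunks` and `generateSummary` helpers. A minimal word-based `splitIntoChunks` might look like this (`generateSummary` would typically be another LLM call, so it is left out):

```typescript
// Minimal sketch: group whitespace-separated words into
// fixed-size chunks of roughly chunkSize words each.
function splitIntoChunks(content: string, chunkSize: number): string[] {
  const words = content.split(/\s+/).filter(w => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += chunkSize) {
    chunks.push(words.slice(i, i + chunkSize).join(' '));
  }
  return chunks;
}

const chunks = splitIntoChunks('alpha beta gamma delta epsilon', 2);
// chunks: ['alpha beta', 'gamma delta', 'epsilon']
```

A production version would more likely chunk by tokens or by structural boundaries (headings, functions) rather than raw word counts.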

Strategy 5: Repeated Key Information

Repeat critical constraints at multiple positions:

```typescript
const constraints = `
CRITICAL: Return Result<T, E>, never throw exceptions.
`;

function buildContextWithRepeats(
  staticContext: string,
  code: string,
  task: string
): string {
  return `
${constraints}

${staticContext}

${constraints}

Code

${code}

${constraints}

Task

${task}
`;
}
```

Strategy 6: Attention Anchors

Use formatting that naturally draws attention:

````markdown
⚠️ CRITICAL CONSTRAINT ⚠️

Never use throw statements. Return Result types only.

🎯 RELEVANT CODE 🎯

```typescript
// This is the function you need to modify
function processUser(user: User): Result<ProcessedUser, Error> {
  // …
}
```

📋 YOUR TASK 📋

Implement the missing validation logic.
````

Symbols and formatting create visual anchors that influence attention patterns.

Strategy 7: Hierarchical Context with Pointers

Use pointers from high-attention zones to middle content:

```markdown
System Prompt (Beginning – High Attention)

Key patterns are defined in Section 3 below. You MUST follow them.

Section 1: Background

…general info…

Section 2: Architecture

…architecture details…

Section 3: PATTERNS TO FOLLOW (Critical)

…the important patterns…

Section 4: Task (End – High Attention)

Implement following the patterns in Section 3.
```

The pointer from beginning creates a mental link that increases attention to Section 3.

Implementation: Context Position Optimizer

```typescript
interface ContextItem {
  content: string;
  tokens: number;
  importance: number; // 0-1, higher = more important
  type: 'constraint' | 'example' | 'code' | 'documentation' | 'task';
}

interface PositionedContext {
  items: Array<{ item: ContextItem; position: number }>;
  totalTokens: number;
  attentionCoverage: number; // Share of important content in high-attention zones
}

function optimizeContextPosition(
  items: ContextItem[],
  maxTokens: number
): PositionedContext {
  // Filter to fit budget, keeping the most important items
  const sorted = [...items].sort((a, b) => b.importance - a.importance);
  const selected: ContextItem[] = [];
  let totalTokens = 0;

  for (const item of sorted) {
    if (totalTokens + item.tokens <= maxTokens) {
      selected.push(item);
      totalTokens += item.tokens;
    }
  }

  // Position optimization
  const highAttentionItems = selected.filter(i => i.importance > 0.7);
  const mediumItems = selected.filter(i => i.importance <= 0.7 && i.importance > 0.4);
  const lowItems = selected.filter(i => i.importance <= 0.4);

  // Constraints always at beginning
  const constraints = highAttentionItems.filter(i => i.type === 'constraint');

  // Task always at end
  const tasks = highAttentionItems.filter(i => i.type === 'task');

  // Examples and code near end (high attention)
  const examples = highAttentionItems.filter(i =>
    i.type === 'example' || i.type === 'code'
  );

  // Remaining high-importance items (e.g. documentation) go before the examples
  const otherHigh = highAttentionItems.filter(i =>
    !constraints.includes(i) && !tasks.includes(i) && !examples.includes(i)
  );

  // Build positioned context
  const positioned: ContextItem[] = [
    ...constraints, // Beginning: constraints
    ...lowItems,    // Middle: low importance (OK to have reduced attention)
    ...mediumItems, // Middle-end transition
    ...otherHigh,   // Remaining high-importance content
    ...examples,    // Near end: critical code/examples
    ...tasks        // End: task description
  ];

  // Calculate positions and attention coverage
  let currentPosition = 0;
  const result: Array<{ item: ContextItem; position: number }> = [];

  for (const item of positioned) {
    const startPos = totalTokens > 0 ? currentPosition / totalTokens : 0;
    result.push({ item, position: startPos });
    currentPosition += item.tokens;
  }

  // How much important content falls in the high-attention zones?
  const highAttentionThreshold = 0.15; // 0-15% and 85-100%
  const importantInHighAttention = result.filter(({ item, position }) =>
    item.importance > 0.7 &&
    (position < highAttentionThreshold || position > 1 - highAttentionThreshold)
  );

  const totalImportant = selected.filter(i => i.importance > 0.7)
    .reduce((sum, i) => sum + i.tokens, 0);
  const importantInZones = importantInHighAttention
    .reduce((sum, { item }) => sum + item.tokens, 0);

  return {
    items: result,
    totalTokens,
    attentionCoverage: totalImportant > 0 ? importantInZones / totalImportant : 1
  };
}
```

Measuring Effectiveness

Metric 1: Position Score

```typescript
function calculatePositionScore(
  items: Array<{ item: ContextItem; position: number }>
): number {
  let score = 0;
  let maxScore = 0;

  for (const { item, position } of items) {
    const importance = item.importance;
    maxScore += importance;

    // High attention zones: 0-15% and 85-100%
    if (position < 0.15 || position > 0.85) {
      score += importance * 1.0; // Full credit
    } else if (position < 0.30 || position > 0.70) {
      score += importance * 0.7; // Partial credit
    } else {
      score += importance * 0.4; // Low credit (danger zone)
    }
  }

  return maxScore > 0 ? score / maxScore : 1;
}

// Target: positionScore > 0.8
```

Metric 2: Constraint Compliance Rate

```typescript
interface ComplianceMetrics {
  constraintsTotal: number;
  constraintsFollowed: number;
  complianceRate: number;
  positionCorrelation: number; // Do well-positioned constraints have higher compliance?
}

// Track whether constraints at different positions are followed
// High correlation = position matters for your use case
```
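A hedged sketch of how the count-based fields might be filled in from per-constraint observations (the `ConstraintResult` shape and aggregation logic are illustrative; `positionCorrelation` is omitted since it needs a correlation over many runs):

```typescript
// Hypothetical per-constraint observation: where the constraint sat
// in context (0-1) and whether the output respected it.
interface ConstraintResult {
  position: number;
  followed: boolean;
}

function computeCompliance(results: ConstraintResult[]) {
  const constraintsTotal = results.length;
  const constraintsFollowed = results.filter(r => r.followed).length;
  return {
    constraintsTotal,
    constraintsFollowed,
    complianceRate: constraintsTotal > 0 ? constraintsFollowed / constraintsTotal : 1
  };
}

const metrics = computeCompliance([
  { position: 0.05, followed: true },  // beginning: followed
  { position: 0.50, followed: false }, // middle: violated
  { position: 0.55, followed: false }, // middle: violated
  { position: 0.95, followed: true }   // end: followed
]);
// metrics.complianceRate === 0.5
```

If violations cluster around middle positions, as in this toy data, that is a strong signal your task is position-sensitive.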

Metric 3: A/B Testing Position

```typescript
async function abTestPositioning(
  context: ContextItem[],
  task: string,
  numTrials: number = 10
): Promise<{ original: number; optimized: number }> {
  let originalSuccess = 0;
  let optimizedSuccess = 0;

  for (let i = 0; i < numTrials; i++) {
    // Original: naive ordering
    const originalContext = context.map(c => c.content).join('\n');
    const originalOutput = await llm.generate(originalContext + '\n' + task);
    if (meetsRequirements(originalOutput)) originalSuccess++;

    // Optimized: position-aware ordering
    const optimized = optimizeContextPosition(context, 100000);
    const optimizedContext = optimized.items.map(p => p.item.content).join('\n');
    const optimizedOutput = await llm.generate(optimizedContext + '\n' + task);
    if (meetsRequirements(optimizedOutput)) optimizedSuccess++;
  }

  return {
    original: originalSuccess / numTrials,
    optimized: optimizedSuccess / numTrials
  };
}
```

Best Practices

1. Audit Your Context Structure

```typescript
function auditContextPositions(context: string, markers: string[]): void {
  const totalLength = context.length;

  for (const marker of markers) {
    const position = context.indexOf(marker);
    if (position === -1) {
      console.log(`${marker}: NOT FOUND`);
    } else {
      const fraction = position / totalLength;
      const pct = (fraction * 100).toFixed(1);
      const zone = fraction < 0.15 || fraction > 0.85 ? 'HIGH' : 'LOW';
      console.log(`${marker}: ${pct}% (${zone} attention zone)`);
    }
  }
}

// Usage:
auditContextPositions(context, [
  'CRITICAL CONSTRAINT:',
  'function processUser',
  'Your task is to'
]);
```

2. Use Position-Aware Templates

```typescript
const POSITION_OPTIMIZED_TEMPLATE = `

{{constraints}}

{{background}}
{{documentation}}
{{utilities}}

{{relevantCode}}

{{examples}}

{{task}}
`;
```

3. Avoid Middle-Heavy Context

```typescript
// BAD: All important content in the middle
const badContext = `
System prompt…
Type definitions…
${criticalConstraints}  // Position: 40%
${relevantCode}         // Position: 50%
${examples}             // Position: 60%
More utilities…
Task…
`;

// GOOD: Important content at extremes
const goodContext = `
${criticalConstraints}  // Position: 5%
System prompt…
Type definitions…
More utilities…
${relevantCode}         // Position: 85%
${examples}             // Position: 90%
${task}                 // Position: 95%
`;
```

4. Test Position Sensitivity

Before deploying, test if your specific task is position-sensitive:

```typescript
async function testPositionSensitivity(
  content: string,
  task: string
): Promise<boolean> {
  // Test the same content at different positions
  const results = await Promise.all([
    llm.generate(`${content}\n\n${task}`),                     // Content at start
    llm.generate(`Padding…\n${content}\n\nPadding…\n${task}`), // Content in middle
    llm.generate(`Padding…\n\n${task}\n\n${content}`)          // Content at end
  ]);

  // If outputs differ significantly, position matters
  const unique = new Set(results.map(normalizeOutput)).size;
  return unique > 1;
}
```

Common Pitfalls

Pitfall 1: Assuming Position Doesn’t Matter

```typescript
// BAD: Random/chronological ordering
const context = files.map(f => f.content).join('\n');

// GOOD: Intentional positioning
const context = buildPositionOptimizedContext(files, task);
```

Pitfall 2: Overloading the Middle

```typescript
// BAD: Stuffing everything because “it’s in context”
const context = `
${systemPrompt}
${allDocumentation} // 10K tokens in middle
${allUtilities}     // 15K tokens in middle
${task}
`;

// GOOD: Only include what’s needed, position strategically
const context = `
${systemPrompt}
${relevantDocumentation} // Only 2K tokens
${relevantCode}          // Near end
${task}
`;
```

Pitfall 3: Not Repeating Critical Constraints

```typescript
// BAD: Constraint mentioned once in the middle
// GOOD: Constraint at beginning AND near the task
```
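A minimal sketch of the repetition pattern (the helper name is illustrative): the same constraint string is injected at the start and again immediately before the task, so it lands in both high-attention zones.

```typescript
// Place the constraint at the beginning AND right before the task.
function withRepeatedConstraint(constraint: string, body: string, task: string): string {
  return [constraint, body, constraint, task].join('\n\n');
}

const prompt = withRepeatedConstraint(
  'CRITICAL: Return Result<T, E>, never throw.',
  '…long background context…',
  'Implement the validation logic.'
);
```

This is the same idea as Strategy 5 above, reduced to its smallest form.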

Pitfall 4: Ignoring Context Length Effects

The lost-in-the-middle effect is worse with longer contexts:

```typescript
// Short context (5K tokens): middle still gets ~60% attention
// Long context (100K tokens): middle may get <30% attention

// Adjust strategy based on context length
function getPositioningStrategy(contextTokens: number) {
  if (contextTokens < 10000) {
    return 'standard';   // Less aggressive positioning needed
  } else if (contextTokens < 50000) {
    return 'optimized';  // Use position optimization
  } else {
    return 'aggressive'; // Repeat constraints, minimal middle content
  }
}
```

Conclusion

The lost-in-the-middle effect is a real and measurable phenomenon. Information position in context directly impacts whether it gets used. By understanding and optimizing for attention patterns, you can significantly improve LLM output quality.

Key Takeaways:

  1. U-shaped attention: Beginning and end get most attention, middle gets less
  2. Position critical content: Constraints at start, task-relevant code near end
  3. Avoid middle-stuffing: Progressive disclosure beats context cramming
  4. Use anchors: Summaries, formatting, and repetition boost attention
  5. Measure position score: Track how much important content is in high-attention zones
  6. Test sensitivity: Not all tasks are equally position-sensitive

The difference between “the model ignored my context” and “the model followed my context perfectly” is often just position.
