Summary
Large language models exhibit a U-shaped attention pattern: they attend most strongly to information at the beginning and end of their context window, while information in the middle receives reduced attention. This “lost in the middle” effect causes relevant information to be missed even when present in context. Mitigation strategies include positioning critical content at extremes, using progressive disclosure to avoid middle-stuffing, chunking with summaries, and query-anchored context placement.
The Problem
You’ve carefully curated your context. The relevant code is included. The documentation is there. The examples are perfect. Yet the LLM ignores crucial information and generates incorrect output.
The culprit is often not what you included, but where you placed it.
The U-Shaped Attention Curve
Research from “Lost in the Middle: How Language Models Use Long Contexts” (Liu et al., 2023) documented a striking pattern:
```
Attention Strength
^
|  ████                                        ████
|  ████                                        ████
|  ████                                        ████
|  ████ ▓▓▓▓                             ▓▓▓▓  ████
|  ████ ▓▓▓▓  ░░░░░░░░░░░░░░░░░░░░░░░░░  ▓▓▓▓  ████
|  ████ ▓▓▓▓  ░░░░░░░░░░░░░░░░░░░░░░░░░  ▓▓▓▓  ████
+-----------------------------------------------> Position
   Start              Middle               End

████ = High attention (80-100%)
▓▓▓▓ = Medium attention (50-80%)
░░░░ = Low attention (20-50%)
```
Key findings:
- Information at positions 1-10% and 90-100% is processed most accurately
- Information at positions 40-60% (middle) shows significant accuracy degradation
- Effect is more pronounced as context length increases
- Models trained on long contexts still exhibit the pattern
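The bands from the diagram above can be captured in a small illustrative helper (the thresholds mirror the approximate ranges listed in the findings, not exact measurements from the paper):

```typescript
// Illustrative only: maps a relative position (0-1) in the context window
// to the attention band sketched in the diagram above. Thresholds are
// approximations taken from the ranges quoted in the findings.
type AttentionBand = "high" | "medium" | "low";

function attentionBand(position: number): AttentionBand {
  if (position <= 0.10 || position >= 0.90) return "high";   // the extremes
  if (position <= 0.30 || position >= 0.70) return "medium"; // the shoulders
  return "low";                                              // the middle
}
```

For example, `attentionBand(0.5)` returns `"low"`, which is exactly the danger zone the rest of this article is about.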
Real-World Impact
Scenario: 20K token context with relevant code in middle
```
Context Structure:
├── System prompt (1K tokens)            - Position: 0-5%
├── CLAUDE.md (2K tokens)                - Position: 5-15%
├── Type definitions (3K tokens)         - Position: 15-30%
├── Irrelevant utility files (5K tokens) - Position: 30-55%
├── THE RELEVANT CODE (2K tokens)        - Position: 55-65% ← IN THE DANGER ZONE
├── More utilities (4K tokens)           - Position: 65-85%
└── Test examples (3K tokens)            - Position: 85-100%

Result: Model may miss the relevant code entirely!
```
Symptoms of Lost-in-the-Middle
- Ignoring provided examples: You included exact patterns, but output doesn’t follow them
- Missing explicit constraints: Constraints were in context but violated
- Not using provided code: Relevant code included but model writes from scratch
- Selective blindness: Model acknowledges context exists but doesn’t use it
- Recency bias: Model heavily weights most recent context, ignores middle
Why This Happens
Attention Mechanism Limitations
Transformer attention has computational biases:
```
Attention(Q, K, V) = softmax(Q × K^T / √d_k) × V

Where:
- Q (Query) = What we're looking for
- K (Key)   = What's available in context
- V (Value) = The content itself
```
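To make the softmax step concrete, here is a toy scalar version with made-up scores. It shows only that raw scores normalize into weights summing to 1, and that modest score gaps become large weight gaps:

```typescript
// Toy scalar softmax: turns raw query-key scores into attention weights.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores);                 // subtract max for numerical stability
  const exps = scores.map(s => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map(e => e / total);
}

// Hypothetical scores for three context positions.
const weights = softmax([2.0, 0.5, 1.8]);
// The weights sum to 1; the higher-scored positions take most of the mass,
// so content that scores slightly lower can receive far less attention.
```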
Positional encodings create implicit biases:
- Beginning tokens: Strong signal from initial position embeddings
- End tokens: Strong signal from recency in autoregressive attention
- Middle tokens: Weaker relative position signals, competition from both ends
Long-Context Training Gaps
Even models trained on long contexts show the pattern because:
- Training data has important information distributed differently
- Attention patterns learned from typical document structures
- Most training documents front-load key information
- Autoregressive training creates recency effects
Mitigation Strategies
Strategy 1: Critical Content at Extremes
Place most important information at beginning and end:
```typescript
// ContextItem has at least { content: string; tokens: number; importance: number }
interface ContextStructure {
  beginning: ContextItem[]; // High attention zone (0-15%)
  middle: ContextItem[];    // Lower attention zone (15-85%)
  end: ContextItem[];       // High attention zone (85-100%)
}

function structureContext(items: ContextItem[]): ContextStructure {
  // Sort by importance, descending
  const sorted = [...items].sort((a, b) => b.importance - a.importance);
  const extremeCount = Math.ceil(sorted.length * 0.3); // 30% at the extremes
  const half = Math.ceil(extremeCount / 2);
  return {
    beginning: sorted.slice(0, half),
    middle: sorted.slice(half, sorted.length - half),
    end: sorted.slice(sorted.length - half)
  };
}
```
Application:
```markdown
BEGINNING: Critical constraints (high attention)
- Return types must be Result<T, E>
- Never throw exceptions
- All functions must be pure

MIDDLE: Background context (lower attention OK)
- Project history…
- Architecture documentation…
- Less critical utilities…

END: Task-specific requirements (high attention)
- The specific function to implement
- Required tests to pass
- Current file context
```
Strategy 2: Query-Anchored Context
Place task-relevant context near the query (end of prompt):
```typescript
function buildQueryAnchoredContext(
  staticContext: string,
  relevantCode: string,
  query: string
): string {
  // Static context first (less critical positioning),
  // relevant code right before the query (high attention zone)
  return `
${staticContext}

Relevant Code (Read This Carefully)
${relevantCode}

Your Task
${query}
`;
}
```
Before (Lost in middle):
```
System prompt -> Types -> Utilities -> RELEVANT CODE -> More code -> Task
                                       ^^^^^^^^^^^^^
                                       (Position 50-60%, low attention)
```
After (Query-anchored):
```
System prompt -> Types -> Utilities -> More code -> RELEVANT CODE -> Task
                                                    ^^^^^^^^^^^^^
                                                    (Position 85-95%, high attention)
```
Strategy 3: Progressive Disclosure
Don’t stuff everything into context. Load on-demand:
```typescript
// BAD: Load all 50 files into the middle of context
const context = allFiles.map(f => f.content).join('\n');

// GOOD: Load only relevant files, position strategically
async function buildProgressiveContext(task: Task): Promise<string> {
  const coreContext = await loadCoreContext();         // Always at the beginning
  const relevantFiles = await findRelevantFiles(task); // Near the end
  // Skip irrelevant files entirely - they would just fill the middle
  return `
${coreContext}

Files Relevant to This Task
${relevantFiles.map(f => f.content).join('\n')}

Task
${task.description}
`;
}
```
Strategy 4: Chunking with Summaries
Break long content into chunks with summary headers:
```typescript
function chunkWithSummaries(
  content: string,
  chunkSize: number = 500
): string {
  const chunks = splitIntoChunks(content, chunkSize);
  return chunks.map((chunk, i) => {
    const summary = generateSummary(chunk); // Brief one-line summary
    return `
Chunk ${i + 1}: ${summary}
${chunk}
`;
  }).join('\n');
}
```
Why this helps:
- Summaries at chunk boundaries act as attention anchors
- Model can scan summaries even if chunk content is in low-attention zone
- Reduces effective “middle” by creating mini-beginnings throughout
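The sketch above assumes `splitIntoChunks` and `generateSummary` helpers. A minimal word-based `splitIntoChunks` might look like this (illustrative; a production version would count tokens and avoid splitting mid-sentence or mid-function):

```typescript
// Illustrative word-based chunker. A real implementation would measure
// chunk size in tokens and respect sentence/code boundaries.
function splitIntoChunks(content: string, chunkSize: number): string[] {
  const words = content.split(/\s+/).filter(w => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += chunkSize) {
    chunks.push(words.slice(i, i + chunkSize).join(" "));
  }
  return chunks;
}
```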
Strategy 5: Repeated Key Information
Repeat critical constraints at multiple positions:
```typescript
const constraints = `
CRITICAL: Return Result<T, E>, never throw exceptions.
`;

function buildContextWithRepeats(
  staticContext: string,
  code: string,
  task: string
): string {
  return `
${constraints}
${staticContext}
${constraints}

Code
${code}
${constraints}

Task
${task}
`;
}
```
Strategy 6: Attention Anchors
Use formatting that naturally draws attention:
````markdown
⚠️ CRITICAL CONSTRAINT ⚠️
Never use throw statements. Return Result types only.

🎯 RELEVANT CODE 🎯
```typescript
// This is the function you need to modify
function processUser(user: User): Result<ProcessedUser, Error> {
  // …
}
```

📋 YOUR TASK 📋
Implement the missing validation logic.
````
Symbols and formatting create visual anchors that influence attention patterns.
Strategy 7: Hierarchical Context with Pointers
Use pointers from high-attention zones to middle content:
```markdown
System Prompt (Beginning - High Attention)
Key patterns are defined in Section 3 below. You MUST follow them.

Section 1: Background
…general info…

Section 2: Architecture
…architecture details…

Section 3: PATTERNS TO FOLLOW (Critical)
…the important patterns…

Section 4: Task (End - High Attention)
Implement following the patterns in Section 3.
```
The pointer from beginning creates a mental link that increases attention to Section 3.
Implementation: Context Position Optimizer
```typescript
interface ContextItem {
  content: string;
  tokens: number;
  importance: number; // 0-1, higher = more important
  type: 'constraint' | 'example' | 'code' | 'documentation' | 'task';
}

interface PositionedContext {
  items: Array<{ item: ContextItem; position: number }>;
  totalTokens: number;
  attentionCoverage: number; // Fraction of important content in high-attention zones
}

function optimizeContextPosition(
  items: ContextItem[],
  maxTokens: number
): PositionedContext {
  // Greedily select the most important items that fit the budget
  const sorted = [...items].sort((a, b) => b.importance - a.importance);
  const selected: ContextItem[] = [];
  let totalTokens = 0;
  for (const item of sorted) {
    if (totalTokens + item.tokens <= maxTokens) {
      selected.push(item);
      totalTokens += item.tokens;
    }
  }

  // Bucket by importance
  const highAttentionItems = selected.filter(i => i.importance > 0.7);
  const mediumItems = selected.filter(i => i.importance <= 0.7 && i.importance > 0.4);
  const lowItems = selected.filter(i => i.importance <= 0.4);

  // Constraints always at the beginning
  const constraints = highAttentionItems.filter(i => i.type === 'constraint');
  // Task always at the end
  const tasks = highAttentionItems.filter(i => i.type === 'task');
  // Examples and code near the end (high attention)
  const examples = highAttentionItems.filter(
    i => i.type === 'example' || i.type === 'code'
  );
  // Any remaining high-importance items (e.g. documentation) go up front
  // so they are not silently dropped
  const otherHigh = highAttentionItems.filter(
    i => !constraints.includes(i) && !tasks.includes(i) && !examples.includes(i)
  );

  // Build positioned context
  const positioned: ContextItem[] = [
    ...constraints, // Beginning: constraints
    ...otherHigh,   // Beginning: other high-importance items
    ...lowItems,    // Middle: low importance (OK to have reduced attention)
    ...mediumItems, // Middle-to-end transition
    ...examples,    // Near end: critical code/examples
    ...tasks        // End: task description
  ];

  // Calculate relative positions
  let currentPosition = 0;
  const result: Array<{ item: ContextItem; position: number }> = [];
  for (const item of positioned) {
    result.push({ item, position: currentPosition / totalTokens });
    currentPosition += item.tokens;
  }

  // How much important content landed in high-attention zones (0-15% and 85-100%)?
  const highAttentionThreshold = 0.15;
  const importantInHighAttention = result.filter(({ item, position }) =>
    item.importance > 0.7 &&
    (position < highAttentionThreshold || position > 1 - highAttentionThreshold)
  );
  const totalImportant = selected
    .filter(i => i.importance > 0.7)
    .reduce((sum, i) => sum + i.tokens, 0);
  const importantInZones = importantInHighAttention
    .reduce((sum, { item }) => sum + item.tokens, 0);

  return {
    items: result,
    totalTokens,
    attentionCoverage: totalImportant > 0 ? importantInZones / totalImportant : 1
  };
}
```
Measuring Effectiveness
Metric 1: Position Score
```typescript
interface PositionedItem {
  item: { importance: number }; // 0-1, higher = more important
  position: number;             // 0-1 relative position in context
}

function calculatePositionScore(items: PositionedItem[]): number {
  let score = 0;
  let maxScore = 0;
  for (const { item, position } of items) {
    const importance = item.importance;
    maxScore += importance;
    // High attention zones: 0-15% and 85-100%
    if (position < 0.15 || position > 0.85) {
      score += importance * 1.0; // Full credit
    } else if (position < 0.30 || position > 0.70) {
      score += importance * 0.7; // Partial credit (shoulders)
    } else {
      score += importance * 0.4; // Low credit (danger zone)
    }
  }
  return maxScore > 0 ? score / maxScore : 1;
}

// Target: positionScore > 0.8
```
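A hand-checkable worked example, inlining the same zone weights used by `calculatePositionScore`: two equally important items, one at 5% and one at 50%, score (1.0 + 0.4) / 2 = 0.7, just below the 0.8 target.

```typescript
// Hypothetical audit: two equally important items, one well placed, one not.
const placements = [
  { importance: 1.0, position: 0.05 }, // high-attention zone -> weight 1.0
  { importance: 1.0, position: 0.50 }, // danger zone         -> weight 0.4
];
const weight = (p: number) =>
  p < 0.15 || p > 0.85 ? 1.0 : p < 0.30 || p > 0.70 ? 0.7 : 0.4;
const score =
  placements.reduce((s, x) => s + x.importance * weight(x.position), 0) /
  placements.reduce((s, x) => s + x.importance, 0);
// score = (1.0 + 0.4) / 2 = 0.7, below the 0.8 target
```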
Metric 2: Constraint Compliance Rate
```typescript
interface ComplianceMetrics {
  constraintsTotal: number;
  constraintsFollowed: number;
  complianceRate: number;
  positionCorrelation: number; // Do well-positioned constraints have higher compliance?
}

// Track whether constraints at different positions are followed.
// High correlation = position matters for your use case.
```
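One way these metrics could be computed from logged trials (a sketch; the trial format and the "middleness"-based Pearson correlation are illustrative design choices, not from the paper):

```typescript
interface ConstraintTrial {
  position: number;  // 0-1 relative position of the constraint in context
  followed: boolean; // did the model's output comply?
}

// Sketch: compliance rate plus a crude correlation between "middleness"
// (distance from the nearest extreme) and compliance. A strongly negative
// correlation suggests position matters for your workload.
function analyzeCompliance(trials: ConstraintTrial[]) {
  if (trials.length === 0) return { complianceRate: 1, positionCorrelation: 0 };

  const followedCount = trials.filter(t => t.followed).length;
  const complianceRate = followedCount / trials.length;

  // Middleness: 0 at either extreme, 0.5 at the exact middle
  const xs = trials.map(t => Math.min(t.position, 1 - t.position));
  const ys = trials.map(t => (t.followed ? 1 : 0));
  const mean = (a: number[]) => a.reduce((s, v) => s + v, 0) / a.length;
  const mx = mean(xs);
  const my = mean(ys);
  let cov = 0, vx = 0, vy = 0;
  for (let i = 0; i < trials.length; i++) {
    cov += (xs[i] - mx) * (ys[i] - my);
    vx += (xs[i] - mx) ** 2;
    vy += (ys[i] - my) ** 2;
  }
  const positionCorrelation = vx > 0 && vy > 0 ? cov / Math.sqrt(vx * vy) : 0;

  return { complianceRate, positionCorrelation };
}
```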
Metric 3: A/B Testing Position
```typescript
async function abTestPositioning(
  context: ContextItem[],
  task: string,
  numTrials: number = 10
): Promise<{ original: number; optimized: number }> {
  let originalSuccess = 0;
  let optimizedSuccess = 0;

  for (let i = 0; i < numTrials; i++) {
    // Original: naive ordering
    const originalContext = context.map(c => c.content).join('\n');
    const originalOutput = await llm.generate(originalContext + '\n' + task);
    if (meetsRequirements(originalOutput)) originalSuccess++;

    // Optimized: position-aware ordering
    const optimized = optimizeContextPosition(context, 100000);
    const optimizedContext = optimized.items.map(p => p.item.content).join('\n');
    const optimizedOutput = await llm.generate(optimizedContext + '\n' + task);
    if (meetsRequirements(optimizedOutput)) optimizedSuccess++;
  }

  return {
    original: originalSuccess / numTrials,
    optimized: optimizedSuccess / numTrials
  };
}
```
Best Practices
1. Audit Your Context Structure
```typescript
function auditContextPositions(context: string, markers: string[]): void {
  const totalLength = context.length;
  for (const marker of markers) {
    const position = context.indexOf(marker);
    if (position === -1) {
      console.log(`${marker}: NOT FOUND`);
    } else {
      const relative = position / totalLength;
      const pct = (relative * 100).toFixed(1);
      const zone = relative < 0.15 || relative > 0.85 ? 'HIGH' : 'LOW';
      console.log(`${marker}: ${pct}% (${zone} attention zone)`);
    }
  }
}

// Usage:
auditContextPositions(context, [
  'CRITICAL CONSTRAINT:',
  'function processUser',
  'Your task is to'
]);
```
2. Use Position-Aware Templates
```typescript
const POSITION_OPTIMIZED_TEMPLATE = `
{{constraints}}
{{background}}
{{documentation}}
{{utilities}}
{{relevantCode}}
{{examples}}
{{task}}
`;
```
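A minimal renderer for such a template (the `{{name}}` placeholder syntax is assumed here, not tied to any particular templating library):

```typescript
// Fill {{name}} placeholders in a position-optimized template.
// Missing values are replaced with an empty string.
function renderTemplate(
  template: string,
  values: Record<string, string>
): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => values[key] ?? "");
}
```

Because the template fixes the ordering, callers can only supply content, not positions, which keeps the positioning decision in one reviewed place.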
3. Avoid Middle-Heavy Context
```typescript
// BAD: All important content in the middle
const badContext = `
System prompt…
Type definitions…
${criticalConstraints}  // Position: 40%
${relevantCode}         // Position: 50%
${examples}             // Position: 60%
More utilities…
Task…
`;

// GOOD: Important content at the extremes
const goodContext = `
${criticalConstraints}  // Position: 5%
System prompt…
Type definitions…
More utilities…
${relevantCode}         // Position: 85%
${examples}             // Position: 90%
${task}                 // Position: 95%
`;
```
4. Test Position Sensitivity
Before deploying, test if your specific task is position-sensitive:
```typescript
async function testPositionSensitivity(
  content: string,
  task: string
): Promise<boolean> {
  // Test the same content at different positions
  const results = await Promise.all([
    llm.generate(`${content}\n\n${task}`),                         // Content at start
    llm.generate(`Padding…\n${content}\n\nPadding…\n${task}`),     // Content in middle
    llm.generate(`Padding…\n\n${task}\n\n${content}`)              // Content at end
  ]);
  // If outputs differ significantly, position matters
  const unique = new Set(results.map(normalizeOutput)).size;
  return unique > 1;
}
```
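`normalizeOutput` is left undefined above; one plausible version lowercases and collapses whitespace, so trivial formatting differences are not counted as position sensitivity:

```typescript
// Illustrative normalizer: strips formatting noise before comparing outputs,
// so only substantive differences count as position sensitivity.
function normalizeOutput(output: string): string {
  return output.toLowerCase().replace(/\s+/g, " ").trim();
}
```

For code-generation tasks you might normalize harder (e.g. strip comments) so that only behavioral differences register.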
Common Pitfalls
Pitfall 1: Assuming Position Doesn’t Matter
```typescript
// BAD: Random/chronological ordering
const badContext = files.map(f => f.content).join('\n');

// GOOD: Intentional positioning
const goodContext = buildPositionOptimizedContext(files, task);
```
Pitfall 2: Overloading the Middle
```typescript
// BAD: Stuffing everything in because "it's in context"
const stuffedContext = `
${systemPrompt}
${allDocumentation}  // 10K tokens in the middle
${allUtilities}      // 15K tokens in the middle
${task}
`;

// GOOD: Only include what's needed, positioned strategically
const leanContext = `
${systemPrompt}
${relevantDocumentation}  // Only 2K tokens
${relevantCode}           // Near the end
${task}
`;
```
Pitfall 3: Not Repeating Critical Constraints
```typescript
// BAD: Constraint mentioned once, buried in the middle
// GOOD: Constraint stated at the beginning AND repeated near the task
```
Pitfall 4: Ignoring Context Length Effects
The lost-in-the-middle effect is worse with longer contexts:
```typescript
// Short context (5K tokens): middle still gets ~60% attention
// Long context (100K tokens): middle may get <30% attention
// Adjust strategy based on context length
function getPositioningStrategy(contextTokens: number) {
  if (contextTokens < 10000) {
    return 'standard';   // Less aggressive positioning needed
  } else if (contextTokens < 50000) {
    return 'optimized';  // Use position optimization
  } else {
    return 'aggressive'; // Repeat constraints, minimize middle content
  }
}
```
Conclusion
The lost-in-the-middle effect is a real and measurable phenomenon. Information position in context directly impacts whether it gets used. By understanding and optimizing for attention patterns, you can significantly improve LLM output quality.
Key Takeaways:
- U-shaped attention: Beginning and end get most attention, middle gets less
- Position critical content: Constraints at start, task-relevant code near end
- Avoid middle-stuffing: Progressive disclosure beats context cramming
- Use anchors: Summaries, formatting, and repetition boost attention
- Measure position score: Track how much important content is in high-attention zones
- Test sensitivity: Not all tasks are equally position-sensitive
The difference between “the model ignored my context” and “the model followed my context perfectly” is often just position.
Related
- Context Rot Auto-Compacting – Temporal context degradation (different from positional)
- Progressive Disclosure Context – Load context on-demand to avoid middle-stuffing
- Token Budgeting Strategies – Allocate tokens wisely
- Information Theory for Coding Agents – Why position affects information transfer
- Hierarchical Context Patterns – Structure that aids attention
- Prompt Caching Strategy – Cache stable context at optimal positions
- Context Debugging Framework – Debug when context is ignored
References
- Lost in the Middle: How Language Models Use Long Contexts – Liu et al., 2023. The foundational research on this phenomenon.
- Anthropic Prompt Engineering Guide – Best practices for context structure
- Attention Is All You Need – Vaswani et al. Original transformer paper explaining attention mechanisms

