Context Pollution Recovery: Diagnosing and Fixing Degraded AI Sessions

James Phoenix

Summary

Context pollution occurs when accumulated noise, contradictions, or stale information in an AI session degrades output quality. This article provides a systematic approach to detect pollution symptoms, diagnose root causes, and recover through compacting or clean slate techniques. Recovery strategies depend on pollution severity: mild pollution responds to targeted compacting, while severe pollution requires starting fresh with preserved learnings.

The Problem

Long-running AI coding sessions accumulate context that becomes counterproductive. The AI starts producing worse output than when the session began. Developers recognize something is wrong but lack systematic methods to diagnose whether the issue is context pollution (fixable), model limitations (requires escalation), or task complexity (requires decomposition).

The Solution

Implement a three-stage recovery protocol: (1) Detection through symptom recognition and diagnostic queries, (2) Classification by pollution severity (mild, moderate, severe), and (3) Recovery using appropriate techniques (targeted pruning, compacting, or clean slate). Each severity level has a corresponding recovery strategy that balances context preservation against pollution elimination.

Understanding Context Pollution

Context pollution differs from simple context growth. Growth adds relevant information; pollution adds noise that actively interferes with generation quality.

The Pollution Spectrum

Clean Context          Polluted Context           Toxic Context
     |                      |                          |
     v                      v                          v
All information        Mix of valid and          Contradictions
supports task          obsolete information      dominate context

Signal: 100%           Signal: 40-60%            Signal: <20%
Noise: 0%              Noise: 40-60%             Noise: >80%

Sources of Context Pollution

Source                   Description                               Example
Obsolete code            References to deleted implementations     “Using the Redis cache we set up” (deleted 50 messages ago)
Abandoned approaches     Failed strategies still in context        Multiple debugging attempts for an abandoned feature
Contradictory decisions  Old decisions conflicting with new ones   “Use monolith” vs. later “Use microservices”
Stale architecture       Outdated system descriptions              Old API structure after endpoints changed
Debugging noise          Stack traces and error logs               20 iterations of failed tests
Scope creep              Unrelated discussions                     Off-topic feature discussions

Stage 1: Detecting Context Pollution

Symptom Recognition

Watch for these warning signs that indicate context pollution:

interface PollutionSymptoms {
  // High confidence indicators
  highConfidence: [
    "AI references deleted files or code",
    "AI suggests previously-abandoned approaches",
    "AI confuses old architecture with current",
    "Output quality decreasing over time",
    "Frequent need to correct AI about current state"
  ];

  // Medium confidence indicators
  mediumConfidence: [
    "AI responses becoming longer without more substance",
    "AI hedging or expressing uncertainty about things it knew earlier",
    "Suggestions feel generic rather than project-specific",
    "AI asks questions it asked before"
  ];

  // Low confidence indicators (may be other issues)
  lowConfidence: [
    "Slower response times",
    "AI declining to perform tasks",
    "Responses feel repetitive"
  ];
}

Diagnostic Queries

Run these diagnostic prompts to assess context state:

const diagnosticQueries = {
  // Test for stale references
  staleReferenceTest: `
    What files and functions are we currently working with?
    List only active, existing code - not deleted or planned.
  `,

  // Test for architectural consistency
  architectureTest: `
    Describe the current system architecture in 3 sentences.
    What are the key components and how do they connect?
  `,

  // Test for decision consistency
  decisionTest: `
    What key technical decisions have we made in this session?
    List them in chronological order with current status (active/superseded).
  `,

  // Test for context clarity
  clarityTest: `
    Rate your confidence (1-10) in understanding:
    - Current task requirements
    - Codebase structure
    - What's been completed vs pending
  `
};

// Interpreting results
function interpretDiagnostics(responses: Record<string, string>): PollutionLevel {
  const issues: string[] = [];

  // Check for deleted file references
  if (responses.staleReferenceTest.includes("deleted") ||
      responses.staleReferenceTest.includes("removed")) {
    issues.push("stale_references");
  }

  // Check for architectural confusion (a word-boundary regex handles case
  // and punctuation, unlike a naive split on spaces)
  const contradictionPattern = /\b(but|however|although|previously)\b/i;
  if (contradictionPattern.test(responses.architectureTest)) {
    issues.push("architectural_confusion");
  }

  // Check for superseded decisions still active
  if (responses.decisionTest.includes("superseded") &&
      responses.decisionTest.split("superseded").length > 2) {
    issues.push("decision_conflicts");
  }

  // Check confidence levels
  const confidenceMatch = responses.clarityTest.match(/\d+/g);
  if (confidenceMatch) {
    const avgConfidence = confidenceMatch
      .map(Number)
      .reduce((a, b) => a + b, 0) / confidenceMatch.length;
    if (avgConfidence < 6) {
      issues.push("low_confidence");
    }
  }

  // Classify pollution level
  if (issues.length >= 3) return "severe";
  if (issues.length >= 1) return "moderate";
  return "mild";
}

type PollutionLevel = "mild" | "moderate" | "severe";

Automated Pollution Detection

Build pollution detection into your workflow:

interface ContextHealthCheck {
  messageCount: number;
  tokenEstimate: number;
  lastCompacted: Date | null;
  staleReferences: number;
  contradictions: number;
  debuggingNoise: number; // Messages that are error logs, stack traces
  pollutionScore: number; // 0-100, higher = more polluted
}

function assessContextHealth(
  messages: Array<{ role: string; content: string }>,
  currentFiles: Set<string>
): ContextHealthCheck {
  let staleReferences = 0;
  let contradictions = 0;
  let debuggingNoise = 0;

  const seenDecisions: Map<string, string> = new Map();
  const errorPatterns = [
    /Error:/i,
    /at\s+\w+\s+\(/,  // Stack trace pattern
    /failed/i,
    /exception/i
  ];

  for (const msg of messages) {
    // Count stale file references
    const fileRefs = msg.content.match(/[\w-]+\.(ts|js|tsx|jsx|md|json)/g) || [];
    for (const ref of fileRefs) {
      if (!currentFiles.has(ref) && !msg.content.includes("create") &&
          !msg.content.includes("deleted")) {
        staleReferences++;
      }
    }

    // Detect debugging noise
    if (errorPatterns.some(pattern => pattern.test(msg.content))) {
      debuggingNoise++;
    }

    // Track decision contradictions
    const decisionMatch = msg.content.match(/decided to (use|implement|switch to) (\w+)/i);
    if (decisionMatch) {
      const topic = decisionMatch[2].toLowerCase();
      if (seenDecisions.has(topic) && seenDecisions.get(topic) !== msg.content) {
        contradictions++;
      }
      seenDecisions.set(topic, msg.content);
    }
  }

  const tokenEstimate = messages
    .reduce((sum, m) => sum + m.content.length / 4, 0);

  // Calculate pollution score (0-100)
  const pollutionScore = Math.min(100, Math.round(
    (staleReferences * 10) +
    (contradictions * 15) +
    (debuggingNoise * 5) +
    (messages.length > 100 ? 20 : messages.length / 5)
  ));

  return {
    messageCount: messages.length,
    tokenEstimate: Math.round(tokenEstimate),
    lastCompacted: null, // Track separately
    staleReferences,
    contradictions,
    debuggingNoise,
    pollutionScore
  };
}

// Usage in agent workflow
function shouldRecover(health: ContextHealthCheck): {
  action: "continue" | "compact" | "clean_slate";
  reason: string;
} {
  // Thresholds match the severity bands in Stage 2 (severe: 61+, moderate: 31-60)
  if (health.pollutionScore >= 61) {
    return {
      action: "clean_slate",
      reason: `Severe pollution (score: ${health.pollutionScore}). ` +
              `${health.staleReferences} stale refs, ${health.contradictions} contradictions.`
    };
  }

  if (health.pollutionScore >= 31) {
    return {
      action: "compact",
      reason: `Moderate pollution (score: ${health.pollutionScore}). Compacting recommended.`
    };
  }

  if (health.messageCount > 100 && health.pollutionScore >= 20) {
    return {
      action: "compact",
      reason: `Session length (${health.messageCount} msgs) with mild pollution. Preventive compacting.`
    };
  }

  return {
    action: "continue",
    reason: `Context healthy (score: ${health.pollutionScore}). No recovery needed.`
  };
}

Stage 2: Classifying Pollution Severity

Mild Pollution (Score 0-30)

Characteristics:

  • Few stale references (1-2)
  • No contradictory decisions
  • Output quality stable
  • Minor debugging noise

Recovery: Targeted pruning or continue with monitoring

Moderate Pollution (Score 31-60)

Characteristics:

  • Multiple stale references (3-5)
  • Some contradictory decisions
  • Output quality noticeably degraded
  • Significant debugging noise

Recovery: Structured compacting

Severe Pollution (Score 61-100)

Characteristics:

  • Many stale references (6+)
  • Fundamental architectural confusion
  • Output quality severely degraded
  • Context dominated by noise

Recovery: Clean slate with preserved learnings
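The three bands above map directly to code. The thresholds come from the section headings; the return shape pairing each band with its recovery technique is an assumption for illustration.

```typescript
// Map a pollution score (0-100) to the severity bands defined above,
// paired with the recovery technique recommended for each band.
type Severity = "mild" | "moderate" | "severe";

interface SeverityAssessment {
  level: Severity;
  recovery: "targeted_pruning" | "structured_compacting" | "clean_slate";
}

function classifySeverity(pollutionScore: number): SeverityAssessment {
  if (pollutionScore >= 61) {
    return { level: "severe", recovery: "clean_slate" };
  }
  if (pollutionScore >= 31) {
    return { level: "moderate", recovery: "structured_compacting" };
  }
  return { level: "mild", recovery: "targeted_pruning" };
}
```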

Stage 3: Recovery Techniques

Technique 1: Targeted Pruning (Mild Pollution)

Remove specific polluting content while preserving useful context:

interface PruningTarget {
  type: "stale_reference" | "debugging_noise" | "abandoned_approach";
  description: string;
  messageIndices: number[];
}

const pruningPrompt = `
Review the session and identify content to prune:

1. Stale References: Code or files that no longer exist
2. Debugging Noise: Error logs, stack traces from resolved issues
3. Abandoned Approaches: Strategies we explicitly decided against

For each, note what it was so we remember NOT to revisit it.

Output format:
## Pruning Summary
- Removed references to: [list deleted files/code]
- Cleared debugging for: [list resolved issues]
- Marked as abandoned: [list approaches]

## Current Valid State
- Active files: [list]
- Current architecture: [summary]
- Active decisions: [list]
`;

// The AI applies this pruning and provides a clean summary
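If you manage the message history yourself, one way to act on `PruningTarget` entries is to drop the flagged messages and append a one-line reminder per target, so pruned work is not silently revisited. This is a sketch; the `Message` shape is an assumption matching the earlier health-check example.

```typescript
// Sketch: drop messages flagged by PruningTarget entries, then append a
// compact reminder so the session remembers what was pruned and why.
// (PruningTarget is repeated here so the snippet is self-contained.)
interface PruningTarget {
  type: "stale_reference" | "debugging_noise" | "abandoned_approach";
  description: string;
  messageIndices: number[];
}

interface Message { role: string; content: string; }

function applyPruning(messages: Message[], targets: PruningTarget[]): Message[] {
  const toDrop = new Set(targets.flatMap(t => t.messageIndices));
  const pruned = messages.filter((_, i) => !toDrop.has(i));

  // Keep a one-line record per target so abandoned work is not retried.
  const reminders = targets.map(t => `- [${t.type}] ${t.description}`).join("\n");
  pruned.push({
    role: "system",
    content: `Pruned from context (do not revisit):\n${reminders}`
  });
  return pruned;
}
```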

Technique 2: Structured Compacting (Moderate Pollution)

Compress the entire session into a focused summary:

const compactingPrompt = `
Compact this session into a focused summary. The goal is to eliminate
pollution while preserving essential context.

## What to PRESERVE:
- Current file states (only files that exist now)
- Active architectural decisions (not superseded ones)
- Implementation requirements still pending
- Key learnings from completed work

## What to ELIMINATE:
- References to deleted code
- Failed debugging attempts
- Abandoned approaches (note them as "don't retry")
- Superseded decisions
- Off-topic discussions

## Output Format:

### Project State
- Repository: [name]
- Current branch: [branch]
- Active files: [list with brief descriptions]

### Architecture (Current)
[2-3 sentences describing current system design]

### Completed Work
[Bullet list of what's done, no implementation details]

### Pending Work
[Bullet list of remaining tasks with priority]

### Decisions in Effect
[List of active technical decisions with 1-line rationale]

### Approaches to Avoid
[List approaches we tried and abandoned, with why]

### Key Patterns
[Any patterns or conventions established in this session]

Keep the output under 1000 tokens. This becomes our new starting context.
`;
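Once the model returns the compacted summary, the polluted history can be replaced wholesale. A minimal sketch, assuming you control the message array; how you obtain the summary depends on your client library.

```typescript
// Sketch: replace a polluted message history with the compacted summary,
// so the summary becomes the sole starting context for the session.
interface Message { role: string; content: string; }

function replaceWithCompactedContext(
  compactedSummary: string,
  pendingUserMessage?: string
): Message[] {
  const fresh: Message[] = [
    {
      role: "system",
      content: `Session context (compacted):\n\n${compactedSummary}`
    }
  ];
  // Carry forward any in-flight request so work resumes immediately.
  if (pendingUserMessage) {
    fresh.push({ role: "user", content: pendingUserMessage });
  }
  return fresh;
}
```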

Technique 3: Clean Slate Recovery (Severe Pollution)

Start fresh while preserving learnings:

interface CleanSlatePackage {
  // What we learned (preserve)
  learnings: {
    whatWorked: string[];
    whatFailed: string[];
    keyDecisions: string[];
    discoveredPatterns: string[];
  };

  // Current state (verify before including)
  currentState: {
    activeFiles: string[];
    pendingTasks: string[];
    blockers: string[];
  };

  // Constraints for new session
  constraints: {
    mustDo: string[];
    mustAvoid: string[];
    styleGuidelines: string[];
  };
}

const cleanSlatePrompt = `
The current session has severe context pollution and must be abandoned.
Before starting fresh, extract the following:

## 1. LEARNINGS (What we discovered)

### What Worked
- [List successful approaches]

### What Failed (Do Not Retry)
- [List failed approaches with brief reason]

### Key Decisions Made
- [List decisions that should carry forward]

## 2. CURRENT STATE (Verify each item exists)

### Active Files
- [List only files that currently exist]

### Pending Tasks
- [List uncompleted work]

### Known Blockers
- [List any blocking issues]

## 3. NEW SESSION CONSTRAINTS

### Must Do
- [Requirements that must be followed]

### Must Avoid
- [Approaches explicitly ruled out]

### Style Guidelines
- [Patterns to maintain]

---

This summary will be the ONLY context for a fresh session.
Verify every file reference exists before including it.
`;

// Start new session with clean slate package
function startCleanSession(pkg: CleanSlatePackage): string {
  return `
# Fresh Session Context

## Background
Previous session was terminated due to context pollution.
This is a clean start with preserved learnings.

## What We Know Works
${pkg.learnings.whatWorked.map(w => `- ${w}`).join('\n')}

## Approaches to Avoid (Already Tried, Failed)
${pkg.learnings.whatFailed.map(f => `- ${f}`).join('\n')}

## Active Decisions
${pkg.learnings.keyDecisions.map(d => `- ${d}`).join('\n')}

## Current State
Files: ${pkg.currentState.activeFiles.join(', ')}
Pending: ${pkg.currentState.pendingTasks.join(', ')}

## Constraints
- Must: ${pkg.constraints.mustDo.join(', ')}
- Avoid: ${pkg.constraints.mustAvoid.join(', ')}

Ready to continue. What's the next task?
  `.trim();
}

Recovery Decision Tree

Context Pollution Detected
           |
           v
    Run Diagnostics
           |
           v
  Calculate Pollution Score
           |
     +-----+-----+
     |     |     |
     v     v     v
  0-30   31-60   61+
 (Mild) (Mod)  (Severe)
     |     |     |
     v     v     v
  Prune  Compact  Clean
           |      Slate
           v
    Update CLAUDE.md
    with learnings

Prevention Strategies

Proactive Compacting Schedule

const compactingSchedule = {
  // Time-based triggers
  everyNMessages: 80,
  everyNMinutes: 120, // 2 hours

  // Event-based triggers
  afterFeatureComplete: true,
  afterMajorDecisionChange: true,
  afterDebuggingResolved: true,
  beforeContextSwitch: true,

  // Metric-based triggers
  onPollutionScore: 40,
  onTokenThreshold: 80000
};

function shouldCompactNow(
  health: ContextHealthCheck,
  lastCompacted: Date,
  currentEvent?: string
): boolean {
  const minutesSinceCompact = (Date.now() - lastCompacted.getTime()) / 60000;

  // Time trigger
  if (minutesSinceCompact > compactingSchedule.everyNMinutes) return true;

  // Message count trigger
  if (health.messageCount > compactingSchedule.everyNMessages) return true;

  // Pollution score trigger
  if (health.pollutionScore > compactingSchedule.onPollutionScore) return true;

  // Event triggers
  if (currentEvent === "feature_complete" && compactingSchedule.afterFeatureComplete) {
    return true;
  }

  return false;
}

Context Hygiene Practices

  1. Explicit state updates: After making changes, restate the current state clearly
  2. Decision logging: Mark decisions as active or superseded
  3. File tracking: Maintain an explicit list of active files
  4. Debugging isolation: Contain debugging in sub-conversations when possible
  5. Regular summaries: Periodically summarize completed work

const hygienePrompts = {
  stateUpdate: `
    After this change, update current state:
    - Files modified: [list]
    - Files deleted: [list]
    - Files created: [list]
    - Tests status: [pass/fail]
  `,

  decisionLog: `
    Decision made: [description]
    Supersedes: [previous decision if any]
    Rationale: [brief reason]
    Status: ACTIVE
  `,

  debuggingBoundary: `
    Starting debugging session for: [issue]
    When resolved, summarize findings and clear debugging context.
  `
};

Best Practices

1. Monitor Before Recovery

Don’t recover blindly. Run diagnostics first to understand the pollution type and severity.

2. Preserve Learnings

Every recovery should export learnings. Failed approaches are valuable knowledge that prevents repeat failures.

3. Verify State

Before including any file or code reference in recovery context, verify it currently exists.
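In a Node environment, verification can be as simple as filtering references against the filesystem. A sketch, assuming references are paths relative to the project root:

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Sketch: keep only file references that actually exist on disk before
// writing them into a recovery summary; report what was dropped.
function verifyFileReferences(
  projectRoot: string,
  candidateFiles: string[]
): { verified: string[]; dropped: string[] } {
  const verified: string[] = [];
  const dropped: string[] = [];
  for (const file of candidateFiles) {
    (existsSync(join(projectRoot, file)) ? verified : dropped).push(file);
  }
  return { verified, dropped };
}
```

Anything in `dropped` is itself a useful signal: those names were stale references in the old context.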


4. Update External Memory

After recovery, update CLAUDE.md or DIGEST.md with learnings so future sessions benefit.

5. Set Recovery Checkpoints

After successful recovery, note the recovery point so you can return if new pollution accumulates.
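A checkpoint can be as little as the recovery summary plus a timestamp and score. The in-memory store below is an assumption for illustration; persist checkpoints wherever your session state lives.

```typescript
// Sketch: record a recovery checkpoint so a later re-polluted session can
// roll back to a known-good context instead of recovering from scratch.
interface RecoveryCheckpoint {
  createdAt: string;               // ISO timestamp
  pollutionScoreAtRecovery: number;
  recoveryAction: "prune" | "compact" | "clean_slate";
  contextSummary: string;          // the compacted or clean-slate summary used
}

const checkpoints: RecoveryCheckpoint[] = [];

function recordCheckpoint(
  score: number,
  action: RecoveryCheckpoint["recoveryAction"],
  summary: string
): RecoveryCheckpoint {
  const checkpoint: RecoveryCheckpoint = {
    createdAt: new Date().toISOString(),
    pollutionScoreAtRecovery: score,
    recoveryAction: action,
    contextSummary: summary
  };
  checkpoints.push(checkpoint);
  return checkpoint;
}

function latestCheckpoint(): RecoveryCheckpoint | undefined {
  return checkpoints[checkpoints.length - 1];
}
```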

Common Pitfalls

Pitfall 1: Recovering Too Aggressively

Problem: Starting fresh when targeted pruning would suffice

Symptom: 2 stale references
Bad: Clean slate (loses 50 messages of useful context)
Good: Prune the 2 stale refs (preserves 48 useful messages)

Pitfall 2: Not Recovering Soon Enough

Problem: Continuing with severe pollution, compounding the problem

Symptom: AI generates code for deleted architecture
Bad: Keep trying to correct AI (pollution gets worse)
Good: Recognize severe pollution, initiate clean slate

Pitfall 3: Losing Learnings in Recovery

Problem: Fresh start without preserving what was learned

Bad: "Let's start over" (will repeat same mistakes)
Good: "Extract learnings, then start over" (builds on failures)

Pitfall 4: Including Unverified State

Problem: Recovery context includes references to non-existent files

Bad: "Active files: auth.ts, user.ts, cache.ts"
     (cache.ts was deleted, now pollution continues)
Good: Verify each file exists before including in recovery context
