Progressive Disclosure: Load Context Only When Needed

James Phoenix

Summary

Progressive disclosure is a design principle where information is organized in layers—starting with minimal metadata, expanding to core instructions, then detailed resources—allowing AI agents to load context only when needed. Like a well-organized manual with a table of contents, chapters, and appendices, this pattern enables scalable agent architectures that stay within context limits while maintaining broad capabilities.

The Problem

AI agents face a fundamental tension: they need access to comprehensive knowledge to handle diverse tasks, but context windows are finite and expensive. Loading everything upfront leads to:

  • Context exhaustion: Running out of tokens before completing complex tasks
  • Cost explosion: Paying for irrelevant context on every request
  • Signal dilution: Important information buried in noise
  • Scalability limits: Can’t add new capabilities without hitting context ceilings

Traditional approaches force a choice: either limit agent capabilities or waste context on rarely-used information.

The Solution

Progressive disclosure organizes context into layers that load on-demand:

Level 1: Metadata (always loaded)
   ↓ triggers
Level 2: Core Instructions (loaded when relevant)
   ↓ references
Level 3+: Supplementary Resources (loaded as needed)

This mirrors how humans use reference materials—you don’t memorize an entire manual, you know what sections exist and consult them when needed.

The Three-Level Architecture

Level 1: Metadata Layer

Minimal information loaded into the system prompt at startup:

# SKILL.md frontmatter
---
name: pdf-manipulation
description: Extract text, fill forms, merge/split PDFs, and convert formats
triggers:
  - "pdf"
  - "form"
  - "document"
---

Purpose: Let the agent recognize when a skill applies without loading full instructions.

Token cost: ~50-100 tokens per skill (dozens of skills fit in under ~2,000 tokens)
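
For illustration, a minimal sketch of how an agent harness could use those triggers to decide which skills deserve a full load. This is a naive keyword filter and the helper name is hypothetical; in practice the model itself can judge relevance from the descriptions alone:

// Sketch: pick candidate skills from Level 1 metadata alone.
// Naive keyword matching; the model can also judge relevance directly.
type SkillFrontmatter = { name: string; triggers: string[] };

function matchSkills(query: string, skills: SkillFrontmatter[]): string[] {
  const q = query.toLowerCase();
  return skills
    .filter(s => s.triggers.some(t => q.includes(t.toLowerCase())))
    .map(s => s.name);
}

// matchSkills('Fill out this PDF form', [
//   { name: 'pdf-manipulation', triggers: ['pdf', 'form', 'document'] },
// ]) // -> ['pdf-manipulation']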

Level 2: Core Instructions

Complete skill instructions loaded when the agent determines relevance:

# PDF Manipulation Skill

## Capabilities
- Extract text from PDFs using `pdf-extract` tool
- Fill form fields using `pdf-form` tool
- Merge multiple PDFs with `pdf-merge`
- Split PDFs by page range

## Usage Patterns

### Text Extraction
1. Identify the PDF file path
2. Call `pdf-extract --input <path> --output <format>`
3. Process the extracted text

### Form Filling
See `forms.md` for detailed form-filling instructions.

Purpose: Provide working knowledge for the task at hand.

Token cost: ~500-2000 tokens per skill (loaded only when needed)
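
One way to hand that content to the model is to splice it into the system prompt only for the current task. A sketch, assuming a simple string-based prompt; the helper below is illustrative rather than any specific framework's API:

// Sketch: splice Level 2 instructions into the prompt for this task only.
// basePrompt already contains the Level 1 metadata for every skill.
function buildTaskPrompt(basePrompt: string, skillMarkdown: string): string {
  return [
    basePrompt,
    '## Active skill',
    skillMarkdown, // ~500-2000 tokens, present only while this task runs
  ].join('\n\n');
}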

Level 3+: Supplementary Resources

Specialized content loaded only for specific sub-tasks:

# forms.md - Form Filling Reference

## Field Types
- Text fields: Use `set-text` command
- Checkboxes: Use `set-checkbox` with true/false
- Radio buttons: Use `set-radio` with option index
- Dropdowns: Use `set-dropdown` with value

## Common Form Patterns
...detailed reference content...

Purpose: Deep-dive information for edge cases without bloating core instructions.

Token cost: Variable (only loaded when explicitly referenced)

Implementation Patterns

Pattern 1: Skill Directory Structure

skills/
├── pdf/
│   ├── SKILL.md          # Level 1 + 2
│   ├── forms.md          # Level 3
│   └── reference.md      # Level 3
├── git/
│   ├── SKILL.md
│   ├── workflows.md
│   └── troubleshooting.md
└── testing/
    ├── SKILL.md
    └── fixtures.md
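
At startup, a registry can be built by scanning this layout and reading only the frontmatter of each SKILL.md. A minimal sketch with deliberately naive frontmatter parsing that assumes the simple YAML shape from the Level 1 example (a real implementation would use a YAML parser):

import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Sketch: build the Level 1 registry from skills/*/SKILL.md frontmatter only.
function scanSkills(root = 'skills') {
  return readdirSync(root, { withFileTypes: true })
    .filter(entry => entry.isDirectory())
    .map(entry => {
      const path = join(root, entry.name, 'SKILL.md');
      const frontmatter = readFileSync(path, 'utf-8').split('---')[1] ?? '';
      const field = (key: string) =>
        frontmatter.match(new RegExp(`^${key}:\\s*(.+)$`, 'm'))?.[1]?.trim() ?? '';
      const triggers = [...frontmatter.matchAll(/^\s*-\s*"?([^"\n]+?)"?\s*$/gm)].map(m => m[1]);
      return { name: field('name'), description: field('description'), triggers, path };
    });
}

The important property is that only this frontmatter ever reaches the system prompt; the body of each SKILL.md stays on disk until a trigger fires.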

Pattern 2: Metadata Registry

import { readFileSync } from 'node:fs';

// Load only metadata at startup
interface SkillMetadata {
  name: string;
  description: string;
  triggers: string[];
  path: string;
}

const skillRegistry: SkillMetadata[] = [
  {
    name: 'pdf-manipulation',
    description: 'Extract text, fill forms, merge/split PDFs',
    triggers: ['pdf', 'form', 'document'],
    path: 'skills/pdf/SKILL.md'
  },
  // ... more skills
];

// Full skill loaded only when triggered
function loadSkill(name: string): string {
  const skill = skillRegistry.find(s => s.name === name);
  if (!skill) throw new Error(`Unknown skill: ${name}`);
  return readFileSync(skill.path, 'utf-8');
}

Pattern 3: Lazy Reference Loading

# In SKILL.md
For advanced form patterns, the agent should read `./forms.md`.
For troubleshooting, consult `./troubleshooting.md`.

The agent discovers and loads these files only when the task requires them.
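
A sketch of resolving such a reference at run time, relative to the SKILL.md that mentioned it. The helper is hypothetical; an agent with filesystem access simply reads the file with its existing tools:

import { readFileSync } from 'node:fs';
import { dirname, join } from 'node:path';

// Sketch: load a Level 3 reference only at the moment the task needs it,
// resolved relative to the skill file that pointed to it.
function resolveReference(skillPath: string, reference: string): string {
  const resolved = join(dirname(skillPath), reference);
  return readFileSync(resolved, 'utf-8');
}

// Paid for only when the task actually involves form filling:
// resolveReference('skills/pdf/SKILL.md', './forms.md')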

Real-World Example: CLAUDE.md Hierarchy

Progressive disclosure applies to project documentation too:

project/
├── CLAUDE.md              # Level 1: Project overview, key commands
├── src/
│   ├── CLAUDE.md          # Level 2: Source code conventions
│   └── components/
│       └── CLAUDE.md      # Level 3: Component-specific patterns
└── tests/
    └── CLAUDE.md          # Level 2: Testing conventions

The agent starts with the root CLAUDE.md and loads subdirectory context only when working in that area.
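
A sketch of how that walk could be implemented, gathering every CLAUDE.md from the project root down to the directory the agent is working in. The helper is hypothetical, shown only to make the loading order concrete:

import { existsSync, readFileSync } from 'node:fs';
import { join, relative, sep } from 'node:path';

// Sketch: collect CLAUDE.md files from the root down to the working directory,
// in the order they should be layered into context.
function collectClaudeMd(projectRoot: string, workingDir: string): string[] {
  const segments = relative(projectRoot, workingDir).split(sep).filter(Boolean);
  const dirs = [
    projectRoot,
    ...segments.map((_, i) => join(projectRoot, ...segments.slice(0, i + 1))),
  ];
  return dirs
    .map(dir => join(dir, 'CLAUDE.md'))
    .filter(existsSync)
    .map(path => readFileSync(path, 'utf-8'));
}

// collectClaudeMd('project', 'project/src/components')
// -> [root CLAUDE.md, src/CLAUDE.md, src/components/CLAUDE.md]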

Benefits

1. Scalability

Add unlimited skills without context explosion:

10 skills × 50 tokens metadata = 500 tokens always loaded
vs.
10 skills × 1500 tokens full = 15,000 tokens (30x more)

2. Cost Efficiency

Pay only for context you use:

Task: "Extract text from report.pdf"

Without progressive disclosure:
- Load all skills: 15,000 tokens
- Cost: $0.045 input (at $3 per million input tokens)

With progressive disclosure:
- Load metadata: 500 tokens
- Load PDF skill: 1,500 tokens
- Cost: $0.006 input (87% savings)

3. Maintainability

Update skills independently:

# Update only the PDF skill
echo "New form-filling instructions" >> skills/pdf/forms.md

# No changes needed to other skills or core system

4. Unbounded Capability

Agents with filesystem access aren’t limited to context window:

“Agents with filesystem and code execution tools don’t need to load entire skills into their context window—they can read files as needed.”

Anti-Patterns to Avoid

❌ Flat Loading

# DON'T: Load everything upfront
System prompt: [20,000 tokens of instructions for all possible tasks]

❌ Deep Nesting Without Metadata

skills/
└── category/
    └── subcategory/
        └── SKILL.md  # Agent can't discover this without crawling

❌ Monolithic Skills

# DON'T: 5000-token skill file
Everything about PDFs, forms, OCR, conversion, watermarks, encryption...

✅ Instead: Split and Reference

# SKILL.md (500 tokens)
Core PDF operations. See `advanced.md` for OCR and encryption.

Integration with Other Patterns

Progressive Disclosure + Hierarchical Context

Use CLAUDE.md files at each directory level:

Root CLAUDE.md → Subdirectory CLAUDE.md → File-specific comments

Each level adds context only when the agent enters that scope.

Progressive Disclosure + Model Switching

Load lightweight metadata with fast models, full skills with capable models:

// Haiku for skill selection (cheap, fast).
// `haiku.classify` is an illustrative wrapper: it shows the small model only
// the metadata registry and asks it to name the relevant skill.
const relevantSkill = await haiku.classify(query, skillRegistry);

// Opus for skill execution (capable, thorough).
// Only now is the full SKILL.md loaded and sent to the larger model.
const fullSkill = loadSkill(relevantSkill);
await opus.execute(query, fullSkill);

Progressive Disclosure + Prompt Caching

Cache frequently-used skill combinations:

const cachedSkills = {
  'code-review': ['git', 'testing', 'linting'],
  'deployment': ['docker', 'aws', 'monitoring']
};
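
Building on the map above and the loadSkill helper from Pattern 2, a sketch of how a stable skill bundle could be marked as a cache breakpoint. The cache_control shape follows Anthropic-style prompt caching; treat the exact field names as an assumption to verify against your provider's documentation:

// Sketch: keep the reusable skill bundle as a stable, cacheable prefix and
// put per-request details after it. Uses loadSkill from Pattern 2.
function buildSystemBlocks(skillNames: string[], taskNote: string) {
  const bundle = skillNames.map(loadSkill).join('\n\n');
  return [
    { type: 'text', text: bundle, cache_control: { type: 'ephemeral' } }, // cached across requests
    { type: 'text', text: taskNote },                                     // varies per request
  ];
}

// buildSystemBlocks(cachedSkills['code-review'], 'Review the current diff');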

Measuring Success

Token Efficiency

Metric: Average tokens loaded per task
Target: <3000 tokens for common tasks
Baseline: Full context loading (~15,000 tokens)

Skill Discovery Rate

Metric: % of tasks where correct skill is loaded
Target: >95%
Measure: Log skill loads vs. task completion
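
If every skill load and task completion is logged with a shared task id, the rate falls out of the logs. A minimal sketch; the event shape here is hypothetical:

// Sketch: log skill loads and task outcomes, then compute
// discovery rate = tasks whose loaded skills were actually used / all tasks.
type SkillEvent =
  | { kind: 'skill_loaded'; taskId: string; skill: string }
  | { kind: 'task_completed'; taskId: string; usedSkills: string[] };

const events: SkillEvent[] = [];

function discoveryRate(): number {
  const completed = events.filter(
    (e): e is Extract<SkillEvent, { kind: 'task_completed' }> => e.kind === 'task_completed'
  );
  const correct = completed.filter(t =>
    events.some(e => e.kind === 'skill_loaded' && e.taskId === t.taskId && t.usedSkills.includes(e.skill))
  );
  return completed.length ? correct.length / completed.length : 0;
}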

Cost per Task

Metric: Average input token cost per task
Target: 70% reduction from baseline

Conclusion

Progressive disclosure transforms context management from a constraint into a feature. By organizing information in layers—metadata, core instructions, supplementary resources—agents can scale to unlimited capabilities while staying within context limits and budget.

Key Takeaways:

  1. Three levels: Metadata (always) → Core (when relevant) → Deep (when needed)
  2. Metadata is cheap: Load skill descriptions upfront (~50 tokens each)
  3. Lazy loading wins: Full skills only when triggered
  4. Filesystem is memory: Agents can read files on-demand
  5. Mirrors human behavior: Like using a manual’s index before reading chapters

Progressive disclosure isn’t just an optimization—it’s the architecture that makes truly capable AI agents possible.
