Progressive Disclosure: Load Context Only When Needed

James Phoenix

Summary

Progressive disclosure is a design principle where information is organized in layers—starting with minimal metadata, expanding to core instructions, then detailed resources—allowing AI agents to load context only when needed. Like a well-organized manual with a table of contents, chapters, and appendices, this pattern enables scalable agent architectures that stay within context limits while maintaining broad capabilities.

The Problem

AI agents face a fundamental tension: they need access to comprehensive knowledge to handle diverse tasks, but context windows are finite and expensive. Loading everything upfront leads to:

  • Context exhaustion: Running out of tokens before completing complex tasks
  • Cost explosion: Paying for irrelevant context on every request
  • Signal dilution: Important information buried in noise
  • Scalability limits: Can’t add new capabilities without hitting context ceilings

Traditional approaches force a choice: either limit agent capabilities or waste context on rarely-used information.

The Solution

Progressive disclosure organizes context into layers that load on-demand:

Level 1: Metadata (always loaded)
   ↓ triggers
Level 2: Core Instructions (loaded when relevant)
   ↓ references
Level 3+: Supplementary Resources (loaded as needed)

This mirrors how humans use reference materials—you don’t memorize an entire manual, you know what sections exist and consult them when needed.

The Three-Level Architecture

Level 1: Metadata Layer

Minimal information loaded into the system prompt at startup:

# SKILL.md frontmatter
---
name: pdf-manipulation
description: Extract text, fill forms, merge/split PDFs, and convert formats
triggers:
  - "pdf"
  - "form"
  - "document"
---

Purpose: Let the agent recognize when a skill applies without loading full instructions.

Token cost: ~50-100 tokens per skill (dozens of skills fit in under ~2,000 tokens)
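
For illustration, a minimal sketch of how an agent harness could use those triggers to decide which skills deserve a full load. This is a naive keyword filter and the helper name is hypothetical; in practice the model itself can judge relevance from the descriptions alone:

// Sketch: pick candidate skills from Level 1 metadata alone.
// Naive keyword matching; the model can also judge relevance directly.
type SkillFrontmatter = { name: string; triggers: string[] };

function matchSkills(query: string, skills: SkillFrontmatter[]): string[] {
  const q = query.toLowerCase();
  return skills
    .filter(s => s.triggers.some(t => q.includes(t.toLowerCase())))
    .map(s => s.name);
}

// matchSkills('Fill out this PDF form', [
//   { name: 'pdf-manipulation', triggers: ['pdf', 'form', 'document'] },
// ]) // -> ['pdf-manipulation']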

Level 2: Core Instructions

Complete skill instructions loaded when the agent determines relevance:

# PDF Manipulation Skill

## Capabilities
- Extract text from PDFs using `pdf-extract` tool
- Fill form fields using `pdf-form` tool
- Merge multiple PDFs with `pdf-merge`
- Split PDFs by page range

## Usage Patterns

### Text Extraction
1. Identify the PDF file path
2. Call `pdf-extract --input <path> --output <format>`
3. Process the extracted text

### Form Filling
See `forms.md` for detailed form-filling instructions.

Purpose: Provide working knowledge for the task at hand.

Token cost: ~500-2000 tokens per skill (loaded only when needed)
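
One way to hand that content to the model is to splice it into the system prompt only for the current task. A sketch, assuming a simple string-based prompt; the helper below is illustrative rather than any specific framework's API:

// Sketch: splice Level 2 instructions into the prompt for this task only.
// basePrompt already contains the Level 1 metadata for every skill.
function buildTaskPrompt(basePrompt: string, skillMarkdown: string): string {
  return [
    basePrompt,
    '## Active skill',
    skillMarkdown, // ~500-2000 tokens, present only while this task runs
  ].join('\n\n');
}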

Level 3+: Supplementary Resources

Specialized content loaded only for specific sub-tasks:

# forms.md - Form Filling Reference

## Field Types
- Text fields: Use `set-text` command
- Checkboxes: Use `set-checkbox` with true/false
- Radio buttons: Use `set-radio` with option index
- Dropdowns: Use `set-dropdown` with value

## Common Form Patterns
...detailed reference content...

Purpose: Deep-dive information for edge cases without bloating core instructions.

Token cost: Variable (only loaded when explicitly referenced)

Implementation Patterns

Pattern 1: Skill Directory Structure

skills/
├── pdf/
│   ├── SKILL.md          # Level 1 + 2
│   ├── forms.md          # Level 3
│   └── reference.md      # Level 3
├── git/
│   ├── SKILL.md
│   ├── workflows.md
│   └── troubleshooting.md
└── testing/
    ├── SKILL.md
    └── fixtures.md
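
At startup, a registry can be built by scanning this layout and reading only the frontmatter of each SKILL.md. A minimal sketch with deliberately naive frontmatter parsing that assumes the simple YAML shape from the Level 1 example (a real implementation would use a YAML parser):

import { readdirSync, readFileSync } from 'node:fs';
import { join } from 'node:path';

// Sketch: build the Level 1 registry from skills/*/SKILL.md frontmatter only.
function scanSkills(root = 'skills') {
  return readdirSync(root, { withFileTypes: true })
    .filter(entry => entry.isDirectory())
    .map(entry => {
      const path = join(root, entry.name, 'SKILL.md');
      const frontmatter = readFileSync(path, 'utf-8').split('---')[1] ?? '';
      const field = (key: string) =>
        frontmatter.match(new RegExp(`^${key}:\\s*(.+)$`, 'm'))?.[1]?.trim() ?? '';
      const triggers = [...frontmatter.matchAll(/^\s*-\s*"?([^"\n]+?)"?\s*$/gm)].map(m => m[1]);
      return { name: field('name'), description: field('description'), triggers, path };
    });
}

The important property is that only this frontmatter ever reaches the system prompt; the body of each SKILL.md stays on disk until a trigger fires.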

Pattern 2: Metadata Registry

import { readFileSync } from 'node:fs';

// Load only metadata at startup
interface SkillMetadata {
  name: string;
  description: string;
  triggers: string[];
  path: string;
}

const skillRegistry: SkillMetadata[] = [
  {
    name: 'pdf-manipulation',
    description: 'Extract text, fill forms, merge/split PDFs',
    triggers: ['pdf', 'form', 'document'],
    path: 'skills/pdf/SKILL.md'
  },
  // ... more skills
];

// Full skill loaded only when triggered
function loadSkill(name: string): string {
  const skill = skillRegistry.find(s => s.name === name);
  if (!skill) throw new Error(`Unknown skill: ${name}`);
  return readFileSync(skill.path, 'utf-8');
}

Pattern 3: Lazy Reference Loading

# In SKILL.md
For advanced form patterns, the agent should read `./forms.md`.
For troubleshooting, consult `./troubleshooting.md`.

The agent discovers and loads these files only when the task requires them.
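
A sketch of resolving such a reference at run time, relative to the SKILL.md that mentioned it. The helper is hypothetical; an agent with filesystem access simply reads the file with its existing tools:

import { readFileSync } from 'node:fs';
import { dirname, join } from 'node:path';

// Sketch: load a Level 3 reference only at the moment the task needs it,
// resolved relative to the skill file that pointed to it.
function resolveReference(skillPath: string, reference: string): string {
  const resolved = join(dirname(skillPath), reference);
  return readFileSync(resolved, 'utf-8');
}

// Paid for only when the task actually involves form filling:
// resolveReference('skills/pdf/SKILL.md', './forms.md')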

Real-World Example: CLAUDE.md Hierarchy

Progressive disclosure applies to project documentation too:

project/
├── CLAUDE.md              # Level 1: Project overview, key commands
├── src/
│   ├── CLAUDE.md          # Level 2: Source code conventions
│   └── components/
│       └── CLAUDE.md      # Level 3: Component-specific patterns
└── tests/
    └── CLAUDE.md          # Level 2: Testing conventions

The agent starts with the root CLAUDE.md and loads subdirectory context only when working in that area.
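
A sketch of how that walk could be implemented, gathering every CLAUDE.md from the project root down to the directory the agent is working in. The helper is hypothetical, shown only to make the loading order concrete:

import { existsSync, readFileSync } from 'node:fs';
import { join, relative, sep } from 'node:path';

// Sketch: collect CLAUDE.md files from the root down to the working directory,
// in the order they should be layered into context.
function collectClaudeMd(projectRoot: string, workingDir: string): string[] {
  const segments = relative(projectRoot, workingDir).split(sep).filter(Boolean);
  const dirs = [
    projectRoot,
    ...segments.map((_, i) => join(projectRoot, ...segments.slice(0, i + 1))),
  ];
  return dirs
    .map(dir => join(dir, 'CLAUDE.md'))
    .filter(existsSync)
    .map(path => readFileSync(path, 'utf-8'));
}

// collectClaudeMd('project', 'project/src/components')
// -> [root CLAUDE.md, src/CLAUDE.md, src/components/CLAUDE.md]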

Benefits

1. Scalability

Add unlimited skills without context explosion:

10 skills × 50 tokens metadata = 500 tokens always loaded
vs.
10 skills × 1500 tokens full = 15,000 tokens (30x more)

2. Cost Efficiency

Pay only for context you use:

Task: "Extract text from report.pdf"

Without progressive disclosure:
- Load all skills: 15,000 tokens
- Cost: $0.045 input (at $3 per million input tokens)

With progressive disclosure:
- Load metadata: 500 tokens
- Load PDF skill: 1,500 tokens
- Cost: $0.006 input (87% savings)

3. Maintainability

Update skills independently:

# Update only the PDF skill
echo "New form-filling instructions" >> skills/pdf/forms.md

# No changes needed to other skills or core system

4. Unbounded Capability

Agents with filesystem access aren’t limited to context window:

“Agents with filesystem and code execution tools don’t need to load entire skills into their context window—they can read files as needed.”

Anti-Patterns to Avoid

❌ Flat Loading

# DON'T: Load everything upfront
System prompt: [20,000 tokens of instructions for all possible tasks]

❌ Deep Nesting Without Metadata

skills/
└── category/
    └── subcategory/
        └── SKILL.md  # Agent can't discover this without crawling

❌ Monolithic Skills

# DON'T: 5000-token skill file
Everything about PDFs, forms, OCR, conversion, watermarks, encryption...

✅ Instead: Split and Reference

# SKILL.md (500 tokens)
Core PDF operations. See `advanced.md` for OCR and encryption.

Integration with Other Patterns

Progressive Disclosure + Hierarchical Context

Use CLAUDE.md files at each directory level:

Root CLAUDE.md → Subdirectory CLAUDE.md → File-specific comments

Each level adds context only when the agent enters that scope.

Progressive Disclosure + Model Switching

Load lightweight metadata with fast models, full skills with capable models:

// Haiku for skill selection (cheap, fast).
// `haiku.classify` is an illustrative wrapper: it shows the small model only
// the metadata registry and asks it to name the relevant skill.
const relevantSkill = await haiku.classify(query, skillRegistry);

// Opus for skill execution (capable, thorough).
// Only now is the full SKILL.md loaded and sent to the larger model.
const fullSkill = loadSkill(relevantSkill);
await opus.execute(query, fullSkill);

Progressive Disclosure + Prompt Caching

Cache frequently-used skill combinations:

const cachedSkills = {
  'code-review': ['git', 'testing', 'linting'],
  'deployment': ['docker', 'aws', 'monitoring']
};
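
Building on the map above and the loadSkill helper from Pattern 2, a sketch of how a stable skill bundle could be marked as a cache breakpoint. The cache_control shape follows Anthropic-style prompt caching; treat the exact field names as an assumption to verify against your provider's documentation:

// Sketch: keep the reusable skill bundle as a stable, cacheable prefix and
// put per-request details after it. Uses loadSkill from Pattern 2.
function buildSystemBlocks(skillNames: string[], taskNote: string) {
  const bundle = skillNames.map(loadSkill).join('\n\n');
  return [
    { type: 'text', text: bundle, cache_control: { type: 'ephemeral' } }, // cached across requests
    { type: 'text', text: taskNote },                                     // varies per request
  ];
}

// buildSystemBlocks(cachedSkills['code-review'], 'Review the current diff');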

Measuring Success

Token Efficiency

Metric: Average tokens loaded per task
Target: <3000 tokens for common tasks
Baseline: Full context loading (~15,000 tokens)

Skill Discovery Rate

Metric: % of tasks where correct skill is loaded
Target: >95%
Measure: Log skill loads vs. task completion
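
If every skill load and task completion is logged with a shared task id, the rate falls out of the logs. A minimal sketch; the event shape here is hypothetical:

// Sketch: log skill loads and task outcomes, then compute
// discovery rate = tasks whose loaded skills were actually used / all tasks.
type SkillEvent =
  | { kind: 'skill_loaded'; taskId: string; skill: string }
  | { kind: 'task_completed'; taskId: string; usedSkills: string[] };

const events: SkillEvent[] = [];

function discoveryRate(): number {
  const completed = events.filter(
    (e): e is Extract<SkillEvent, { kind: 'task_completed' }> => e.kind === 'task_completed'
  );
  const correct = completed.filter(t =>
    events.some(e => e.kind === 'skill_loaded' && e.taskId === t.taskId && t.usedSkills.includes(e.skill))
  );
  return completed.length ? correct.length / completed.length : 0;
}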

Cost per Task

Metric: Average input token cost per task
Target: 70% reduction from baseline

Conclusion

Progressive disclosure transforms context management from a constraint into a feature. By organizing information in layers—metadata, core instructions, supplementary resources—agents can scale to unlimited capabilities while staying within context limits and budget.

Key Takeaways:

  1. Three levels: Metadata (always) → Core (when relevant) → Deep (when needed)
  2. Metadata is cheap: Load skill descriptions upfront (~50 tokens each)
  3. Lazy loading wins: Full skills only when triggered
  4. Filesystem is memory: Agents can read files on-demand
  5. Mirrors human behavior: Like using a manual’s index before reading chapters

Progressive disclosure isn’t just an optimization—it’s the architecture that makes truly capable AI agents possible.
