Error Registry for Agents: Own the Primitives

James Phoenix

Summary

Agents repeat errors they have no memory of. ERRORS.md files help, but they are flat, unstructured, and require manual curation. The next step is a proper error registry: a structured, queryable store of every error an agent has encountered, with fingerprinting, deduplication, resolution history, and prevention rules. Think Sentry, but built for agents, where you own every primitive. When agents can read and write to a shared error registry before and after every task, recurring failures drop to near zero.

The Problem

There are three levels of agent error memory today, and most teams are stuck at level one.

Level 0: No memory. The agent hits the same error every session. You fix it manually each time. This is the default state for every LLM interaction.

Level 1: Flat files. You maintain an ERRORS.md that documents common mistakes. This works, but it has structural problems:

  • No fingerprinting. Two instances of “missing await” in different files are stored as separate entries or, worse, only one gets recorded.
  • No queryability. You grep for keywords. The agent cannot ask “have I seen an error like this before?” and get a structured answer.
  • No resolution graph. You may know the current fix, but not which fixes were tried and failed, or which other errors a fix introduced.
  • No automatic ingestion. Every entry requires a human to write it. Errors that happen at 2 AM go unrecorded.
  • Context window cost. Including the full ERRORS.md in every prompt wastes tokens on irrelevant entries.

Level 2: External tools. You send errors to Sentry or Datadog. These tools are designed for human developers scrolling dashboards. They have no API surface optimized for agent consumption. The error data exists, but agents cannot act on it.

The gap between Level 1 and what agents actually need is where the error registry sits.

The Solution

Build a structured error registry that agents can read from, write to, and query before every task. Own every primitive:

┌──────────────────────────────────────────────────────────┐
│                      ERROR REGISTRY                      │
├──────────────────────────────────────────────────────────┤
│ Fingerprinting   → Deduplicate errors by structure       │
│ Classification   → Categorize by root cause type         │
│ Resolution Log   → Track what fixes work (and don't)     │
│ Prevention Rules → Machine-readable prevention policies  │
│ Agent API        → Query interface for agent consumption │
│ Auto-Ingestion   → Capture errors without human effort   │
└──────────────────────────────────────────────────────────┘

The registry is not a dashboard. It is an agent-facing knowledge store where every error becomes a permanent lesson.

Why Own the Primitives

Sentry is excellent for human operators. But agents need something different:

Capability           | Sentry (Human-First)      | Error Registry (Agent-First)
---------------------|---------------------------|--------------------------------------
Error format         | Stack traces, breadcrumbs | Structured JSON with code patterns
Resolution tracking  | “Resolve” button          | Resolution log with what was tried
Prevention           | Alert thresholds          | Machine-readable prevention rules
Query interface      | Web dashboard             | Programmatic API or local file query
Context delivery     | Full error page           | Minimal, relevant context snippet
Ingestion            | SDK auto-capture          | Agent tool calls + CI pipeline hooks
Deduplication        | Stack trace hashing       | AST-level structural fingerprinting

When you own the primitives, you control:

  1. What gets stored. Not just stack traces, but the code pattern that caused the error, the bad output the agent generated, and the correct output.
  2. How it is queried. Agents get structured responses, not HTML pages. Token-efficient, relevant context.
  3. How prevention works. Prevention rules are code, not alerts. They feed directly into CLAUDE.md, hooks, or CI checks.
  4. How resolution evolves. A fix that worked once is a candidate. A fix that worked ten times is a rule. A fix that was tried and failed is marked as such.

Architecture

Core Schema

Every error entry has a consistent structure:

interface ErrorEntry {
  // Identity
  id: string;
  fingerprint: string;           // Structural hash for deduplication
  title: string;                 // Human-readable summary

  // Classification
  category: "context" | "model" | "rules" | "testing" | "quality-gate";
  severity: "critical" | "high" | "medium" | "low";
  tags: string[];                // e.g., ["async", "database", "zod"]

  // Evidence
  symptom: string;               // What the developer or agent observes
  badPattern: CodePattern;       // The code that causes the error
  correctPattern: CodePattern;   // The code that fixes it
  rootCause: string;             // Why this happens
  relatedFiles: string[];        // Where this has occurred

  // Tracking
  occurrences: Occurrence[];     // Every time this error appeared
  firstSeen: Date;
  lastSeen: Date;
  frequency: number;             // Total count

  // Resolution
  resolutions: Resolution[];     // Fixes tried, with success/failure
  currentFix: Resolution | null; // The active fix
  preventionRules: PreventionRule[];

  // Status
  status: "active" | "mitigated" | "prevented" | "archived";
}

interface CodePattern {
  language: string;
  code: string;
  description: string;
}

interface Occurrence {
  timestamp: Date;
  file: string;
  context: string;               // What task was being performed
  agentSession: string;          // Which session produced this
}

interface Resolution {
  id: string;
  description: string;
  appliedAt: Date;
  success: boolean;
  sideEffects: string[];         // Other errors this fix introduced
}

interface PreventionRule {
  type: "lint-rule" | "hook" | "test" | "claude-md" | "ci-check";
  description: string;
  implementation: string;        // The actual rule or config
  addedAt: Date;
  effectiveness: number;         // 0-1, tracked over time
}
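
As a concrete illustration, here is a hypothetical, fully populated entry under this schema (all values are invented; dates are shown as ISO strings for readability, while the interface uses Date objects):

```typescript
// Hypothetical entry: an agent that repeatedly omits await on async calls.
const missingAwaitEntry = {
  id: "err-0042",
  fingerprint: "3f9a1c7b2d8e4a51",
  title: "Missing await on async database call",
  category: "testing",
  severity: "high",
  tags: ["async", "database"],
  symptom: "Tests pass locally but data is missing in assertions",
  badPattern: {
    language: "typescript",
    code: "const user = db.users.findById(id);",
    description: "Promise assigned without await",
  },
  correctPattern: {
    language: "typescript",
    code: "const user = await db.users.findById(id);",
    description: "Await the query before using the result",
  },
  rootCause: "The call looks synchronous, so the agent omits await",
  relatedFiles: ["src/api/users.ts"],
  occurrences: [
    {
      timestamp: "2025-01-10T02:14:00Z",
      file: "src/api/users.ts",
      context: "Implementing user lookup endpoint",
      agentSession: "session-118",
    },
  ],
  firstSeen: "2025-01-10T02:14:00Z",
  lastSeen: "2025-01-14T09:30:00Z",
  frequency: 3,
  resolutions: [
    {
      id: "res-1",
      description: "Added await to the query call",
      appliedAt: "2025-01-10T02:20:00Z",
      success: true,
      sideEffects: [],
    },
  ],
  currentFix: {
    id: "res-1",
    description: "Added await to the query call",
    appliedAt: "2025-01-10T02:20:00Z",
    success: true,
    sideEffects: [],
  },
  preventionRules: [],
  status: "active",
};
```

With no prevention rules yet, the entry stays "active" even though a working fix is on record.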

Fingerprinting: The Key Primitive

The most important primitive is fingerprinting. Two errors are the same error if their structural pattern matches, even if they occur in different files, at different times, by different agents.

function fingerprintError(error: RawError): string {
  // Level 1: Exact match on error message template.
  // Collapse paths before numbers so a path like "users2.ts" does not
  // split into "<PATH><NUM>.ts".
  const messageTemplate = error.message
    .replace(/['"][^'"]*['"]/g, "'<STRING>'")  // Normalize strings
    .replace(/\/[\w./]+/g, "<PATH>")           // Normalize paths
    .replace(/\d+/g, "<NUM>");                 // Normalize numbers

  // Level 2: Code pattern hash
  const patternHash = hashCodePattern(error.codeSnippet);

  // Level 3: Error category + affected construct
  const structural = `${error.errorType}:${error.affectedConstruct}`;

  return hash(`${messageTemplate}|${patternHash}|${structural}`);
}

// Example: These two errors produce the same fingerprint
// Error A: "Cannot read property 'email' of null" in src/api/users.ts
// Error B: "Cannot read property 'name' of null" in src/api/posts.ts
// Both are: null-property-access on a database query result

Fingerprinting turns a stream of individual incidents into a registry of error classes. Each class accumulates evidence, resolutions, and prevention rules over time.
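
A minimal runnable sketch of the message-template level, using Node's built-in crypto (the `RawErrorLike` shape and field names are illustrative, not a fixed API):

```typescript
import { createHash } from "node:crypto";

// Illustrative input shape: just the fields the template level needs.
interface RawErrorLike {
  message: string;
  errorType: string;
  affectedConstruct: string;
}

function normalizeMessage(message: string): string {
  return message
    .replace(/['"][^'"]*['"]/g, "<STRING>") // collapse string literals
    .replace(/\/[\w./-]+/g, "<PATH>")       // collapse file paths
    .replace(/\d+/g, "<NUM>");              // collapse numbers
}

function fingerprint(error: RawErrorLike): string {
  const template = normalizeMessage(error.message);
  const structural = `${error.errorType}:${error.affectedConstruct}`;
  // Short, stable hash of the normalized template plus structural info.
  return createHash("sha256")
    .update(`${template}|${structural}`)
    .digest("hex")
    .slice(0, 16);
}
```

Under this sketch, "Cannot read property 'email' of null" and "Cannot read property 'name' of null" normalize to the same template and, given the same error type and construct, the same fingerprint.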

Storage Options

The registry can live at different levels of sophistication:

Option A: Structured JSON file (simplest)

project/
└── .errors/
    ├── registry.json       # All error entries
    ├── index.json          # Fingerprint → entry ID lookup
    └── rules/
        ├── lint-rules.json # Generated lint rules
        └── hooks.json      # Generated hook configs

Good for single-developer projects. Version-controlled, human-readable, no infrastructure.
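
The core of Option A is keeping registry.json and index.json in sync. A sketch of the upsert logic, shown in memory (in practice both objects would be read from and written back to `.errors/` with fs; the entry shape is trimmed for brevity):

```typescript
// registry.json holds entries; index.json maps fingerprint → entry id.
interface RegistryFile {
  entries: Record<string, { id: string; fingerprint: string; frequency: number }>;
  index: Record<string, string>;
}

function upsert(reg: RegistryFile, fingerprint: string, id: string): void {
  const existingId = reg.index[fingerprint];
  if (existingId) {
    // Known error class: bump the count on the existing entry.
    reg.entries[existingId].frequency += 1;
  } else {
    // New error class: create the entry and keep the lookup in sync.
    reg.entries[id] = { id, fingerprint, frequency: 1 };
    reg.index[fingerprint] = id;
  }
}
```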

Option B: SQLite database

CREATE TABLE errors (
  id TEXT PRIMARY KEY,
  fingerprint TEXT UNIQUE,
  title TEXT,
  category TEXT,
  severity TEXT,
  bad_pattern TEXT,
  correct_pattern TEXT,
  root_cause TEXT,
  frequency INTEGER DEFAULT 1,
  first_seen TEXT,
  last_seen TEXT,
  status TEXT DEFAULT 'active'
);

CREATE TABLE occurrences (
  id TEXT PRIMARY KEY,
  error_id TEXT REFERENCES errors(id),
  timestamp TEXT,
  file TEXT,
  context TEXT,
  agent_session TEXT
);

CREATE TABLE resolutions (
  id TEXT PRIMARY KEY,
  error_id TEXT REFERENCES errors(id),
  description TEXT,
  applied_at TEXT,
  success BOOLEAN,
  side_effects TEXT
);

CREATE INDEX idx_fingerprint ON errors(fingerprint);
CREATE INDEX idx_category ON errors(category);
CREATE INDEX idx_status ON errors(status);

Good for teams. Queryable, concurrent-safe, still local.
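
Because fingerprint is UNIQUE, ingestion collapses into a single upsert. A sketch (SQLite 3.24+ supports ON CONFLICT ... DO UPDATE; the frequency column would need to exist as in the schema above):

```sql
INSERT INTO errors (id, fingerprint, title, category, severity, first_seen, last_seen)
VALUES ('err-0042', '3f9a1c7b2d8e4a51', 'Missing await on async database call',
        'testing', 'high', datetime('now'), datetime('now'))
ON CONFLICT(fingerprint) DO UPDATE SET
  frequency = frequency + 1,
  last_seen = excluded.last_seen;
```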

Option C: API service

For organizations running multiple agents across multiple repos. A central registry that all agents query. This is the “build your own Sentry” path, but the API is designed for agents, not humans.

Agent Query Interface

The critical design choice: how do agents consume the registry? The answer is a tool or MCP server that returns token-efficient, relevant results.

// Agent tool: query_error_registry
interface QueryErrorRegistry {
  // Search by similarity to current error
  query?: string;

  // Filter by classification
  category?: ErrorCategory;
  tags?: string[];
  severity?: Severity;

  // Filter by status
  status?: "active" | "mitigated" | "prevented";

  // Limit results for context efficiency
  limit?: number;
}

// Example: Agent encounters a new error
const results = await queryErrorRegistry({
  query: "Cannot read property of null after database query",
  category: "testing",
  limit: 3,
});

// Returns:
// [
//   {
//     title: "Missing null checks after database queries",
//     frequency: 15,
//     status: "mitigated",
//     correctPattern: "if (!result) { return error('Not found'); }",
//     preventionRules: [
//       { type: "lint-rule", rule: "no-unchecked-db-result" }
//     ]
//   }
// ]

The agent now knows: this error has happened 15 times before, there is a known fix, and there is a lint rule that should catch it. It can apply the fix immediately and verify the lint rule is active.

Implementation: From ERRORS.md to Registry

Step 1: Auto-Ingest from Agent Sessions

Instead of manually documenting errors, capture them automatically:

// Hook into agent error events
async function onAgentError(error: AgentError): Promise<void> {
  const fingerprint = fingerprintError(error);
  const existing = await registry.getByFingerprint(fingerprint);

  if (existing) {
    // Known error: increment occurrence
    await registry.addOccurrence(existing.id, {
      timestamp: new Date(),
      file: error.file,
      context: error.taskDescription,
      agentSession: error.sessionId,
    });

    // Update last seen and frequency
    await registry.update(existing.id, {
      lastSeen: new Date(),
      frequency: existing.frequency + 1,
    });
  } else {
    // New error: create entry
    await registry.create({
      fingerprint,
      title: classifyErrorTitle(error),
      category: diagnoseRootCause(error),
      severity: assessSeverity(error),
      symptom: error.message,
      badPattern: extractCodePattern(error),
      firstSeen: new Date(),
      lastSeen: new Date(),
      frequency: 1,
      status: "active",
    });
  }
}

Step 2: Pre-Task Context Injection

Before starting any task, the agent queries the registry for relevant errors:

async function buildTaskContext(task: TaskDescription): Promise<string> {
  // Query registry for errors related to this task
  const relevantErrors = await registry.query({
    tags: extractTags(task),
    status: "active",
    limit: 5,
  });

  if (relevantErrors.length === 0) return "";

  // Format as concise context block
  return `
## Known Error Patterns (from error registry)

${relevantErrors.map(e => `
### ${e.title} (${e.frequency} occurrences)
- Symptom: ${e.symptom}
- Fix: ${e.correctPattern.description}
- Prevention: ${e.preventionRules.map(r => r.description).join(", ")}
`).join("\n")}

Avoid these patterns when implementing this task.
`;
}

This replaces the “include all of ERRORS.md” approach with targeted, relevant context. Token cost drops from thousands of tokens to a few hundred.

Step 3: Post-Resolution Learning

When an agent fixes an error, record the resolution:

async function onErrorResolved(
  errorId: string,
  resolution: ResolutionAttempt
): Promise<void> {
  await registry.addResolution(errorId, {
    description: resolution.description,
    appliedAt: new Date(),
    success: resolution.testsPass,
    sideEffects: resolution.newErrors,
  });

  // If this is the third successful resolution with the same pattern,
  // promote to prevention rule
  const entry = await registry.get(errorId);
  const successfulFixes = entry.resolutions.filter(r => r.success);

  if (successfulFixes.length >= 3 && !entry.preventionRules.length) {
    await suggestPreventionRule(entry);
  }
}

Step 4: Automatic Prevention Promotion

When an error reaches a frequency threshold, automatically generate prevention:

async function suggestPreventionRule(entry: ErrorEntry): Promise<void> {
  // Generate lint rule from bad pattern
  if (entry.badPattern.language === "typescript") {
    const lintRule = await generateLintRule(entry.badPattern, entry.correctPattern);

    await registry.addPreventionRule(entry.id, {
      type: "lint-rule",
      description: `Prevent: ${entry.title}`,
      implementation: lintRule,
      addedAt: new Date(),
      effectiveness: 0, // Track over time
    });
  }

  // Add to CLAUDE.md
  const claudeRule = formatForClaudeMd(entry);
  await registry.addPreventionRule(entry.id, {
    type: "claude-md",
    description: `CLAUDE.md rule: ${entry.title}`,
    implementation: claudeRule,
    addedAt: new Date(),
    effectiveness: 0,
  });

  // Update status
  await registry.update(entry.id, { status: "mitigated" });
}

The Compound Effect

The registry creates a flywheel:

Agent encounters error
    → Registry captures it (auto-ingest)
    → Agent queries registry next time (pre-task context)
    → Error is avoided (known pattern)
    → If error recurs, resolution is tracked (post-resolution learning)
    → At threshold, prevention rule is generated (automatic promotion)
    → Error class is eliminated (status: prevented)

Each loop makes every future agent session better. After weeks of operation:

Week 1:  Agent encounters 20 errors, registry has 20 entries
Week 2:  Agent encounters 15 errors, 5 were prevented by registry
Week 4:  Agent encounters 8 errors, 12 prevented, 3 auto-promoted to lint rules
Week 8:  Agent encounters 3 errors, most are genuinely novel
Week 12: Registry has 80+ entries, 60% have prevention rules, new error rate is minimal

This is the same compound curve as the ERRORS.md pattern, but automated. No human has to remember to document errors. No human has to include the right section in the prompt. The agent does it all.

MCP Server Implementation

The cleanest way to expose the registry to agents is as an MCP server:

// error-registry MCP server
const tools = [
  {
    name: "query_errors",
    description: "Search the error registry for known error patterns",
    parameters: {
      query: { type: "string", description: "Error description or symptom" },
      tags: { type: "array", items: { type: "string" } },
      limit: { type: "number", default: 5 },
    },
  },
  {
    name: "report_error",
    description: "Report a new error or occurrence to the registry",
    parameters: {
      symptom: { type: "string" },
      badCode: { type: "string" },
      file: { type: "string" },
      context: { type: "string" },
    },
  },
  {
    name: "report_resolution",
    description: "Record that an error was fixed",
    parameters: {
      errorId: { type: "string" },
      fixDescription: { type: "string" },
      fixCode: { type: "string" },
      testsPass: { type: "boolean" },
    },
  },
  {
    name: "get_prevention_rules",
    description: "Get active prevention rules for a set of tags",
    parameters: {
      tags: { type: "array", items: { type: "string" } },
    },
  },
];

Now any agent with MCP access can read from and write to the registry without custom integration code.
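
A sketch of the server-side dispatch behind those tools, against an assumed in-memory store (a real server would wire this into an MCP transport and a persistent registry; names are illustrative):

```typescript
type Entry = { id: string; title: string; tags: string[]; frequency: number };

const store: Entry[] = [];

function handleToolCall(name: string, args: Record<string, unknown>): unknown {
  switch (name) {
    case "query_errors": {
      // Filter by tags when given, and cap results for context efficiency.
      const tags = (args.tags as string[]) ?? [];
      return store
        .filter(e => tags.length === 0 || e.tags.some(t => tags.includes(t)))
        .slice(0, (args.limit as number) ?? 5);
    }
    case "report_error": {
      // A real implementation would fingerprint and deduplicate here.
      const entry: Entry = {
        id: `err-${store.length + 1}`,
        title: String(args.symptom),
        tags: (args.tags as string[]) ?? [],
        frequency: 1,
      };
      store.push(entry);
      return entry.id;
    }
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}
```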

Comparison: ERRORS.md vs Error Registry

Dimension            | ERRORS.md                           | Error Registry
---------------------|-------------------------------------|--------------------------------------------
Ingestion            | Manual, human writes each entry     | Automatic, agent reports errors
Deduplication        | Manual, human checks for duplicates | Fingerprinting, automatic
Query                | grep/search, full file in context   | Structured query, relevant results only
Resolution tracking  | Free text, no history               | Structured log, success/failure tracked
Prevention           | Manual, human creates rules         | Auto-promotion at frequency threshold
Context cost         | Full file (thousands of tokens)     | Relevant entries only (hundreds of tokens)
Team scaling         | One file, merge conflicts           | Database, concurrent-safe
Agent interaction    | Passive (read-only)                 | Active (read + write)

ERRORS.md is Level 1. The error registry is Level 3, the step beyond external tools. Both are better than Level 0.

When to Build This

Start with ERRORS.md. It costs nothing and captures 80% of the value.

Graduate to a registry when:

  • You have 20+ documented errors and searching ERRORS.md is slow
  • Multiple agents or team members are encountering the same errors
  • You want automatic ingestion (errors captured without human effort)
  • You want prevention rules generated from frequency data
  • You are running agents in CI/CD and need programmatic error tracking
  • Your ERRORS.md is consuming too many tokens when included in context

Do not build this when:

  • You are a solo developer with a small project
  • You have fewer than 10 documented error patterns
  • Your agent interactions are infrequent

Best Practices

1. Fingerprint at the Pattern Level, Not the Instance Level

Bad:  Hash the full error message (too specific, misses duplicates)
Good: Hash the structural pattern (catches all instances of the same class)

2. Keep Prevention Rules as Code

Prevention rules should be executable, not advisory:

// Bad: "Remember to check for null after database queries"
// Good:
{
  type: "lint-rule",
  implementation: "@typescript-eslint/no-floating-promises: error"
}

3. Track Resolution Effectiveness

Not every fix works. Track which resolutions succeed and which fail:

// After applying a resolution, verify it worked
const resolution = await registry.getResolution(resolutionId);
if (errorRecurred(resolution.errorId, resolution.appliedAt)) {
  await registry.updateResolution(resolutionId, { success: false });
}

4. Prune Stale Entries

Errors that have not occurred in 6+ months with active prevention rules can be archived:

async function pruneStaleEntries(): Promise<void> {
  const stale = await registry.query({
    status: "mitigated",
    lastSeenBefore: sixMonthsAgo(),
  });

  for (const entry of stale) {
    if (entry.preventionRules.length > 0) {
      await registry.update(entry.id, { status: "archived" });
    }
  }
}

5. Separate Registry per Project, Shared Patterns Across Projects

Each project has its own registry. But some error patterns (missing await, null checks, type mismatches) are universal. Extract these into a shared “base registry” that seeds new projects.

~/.error-registry/         # Shared base patterns
project-a/.errors/         # Project-specific errors
project-b/.errors/         # Project-specific errors
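
Seeding a new project from the base registry can be a simple merge keyed on fingerprint, with project-specific entries winning on collision (a sketch; the entry shape is trimmed to the fields the merge needs):

```typescript
type Seedable = { fingerprint: string; title: string };

function seedRegistry(base: Seedable[], project: Seedable[]): Seedable[] {
  const merged = new Map<string, Seedable>();
  for (const entry of base) merged.set(entry.fingerprint, entry);
  // Insert project entries second so they override base patterns.
  for (const entry of project) merged.set(entry.fingerprint, entry);
  return [...merged.values()];
}
```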

Common Pitfalls

Pitfall 1: Over-Engineering the Storage Layer

Start with a JSON file. Move to SQLite when querying gets slow. Move to an API when multiple services need access. Do not start with Postgres.

Pitfall 2: Capturing Too Much Context

Each error entry should be minimal: the bad pattern, the fix, and the prevention rule. Not the full stack trace, not the entire file contents, not the conversation history. Token efficiency matters.

Pitfall 3: Never Reviewing Prevention Effectiveness

A prevention rule that is 50% effective is worse than no rule (it creates false confidence). Track effectiveness and remove rules that do not work.

Pitfall 4: Manual-Only Ingestion

The whole point of the registry over ERRORS.md is automatic capture. If agents cannot write to the registry, you are just building a fancier ERRORS.md.
