Summary
AI coding agents can be modeled as a recursive function: AI_Agent = fn(Verify(Generate(Retrieve()))). They retrieve context, generate code probabilistically, verify the result through quality gates, and recurse until verification passes. Understanding this model helps you design better workflows, improve retrieval strategies, and build quality gates that reduce entropy in the generated output.
The Core Model
AI coding agents—whether Claude Code, Cursor, Aider, or custom implementations—all follow the same fundamental pattern:
AI_Agent = fn(Verify(Generate(Retrieve())))
Where:
- Retrieve: Context gathering (semantic search, file reads, grep, AST analysis)
- Generate: Code production via probabilistic language models
- Verify: Quality gates (tests, type checkers, linters, CI/CD)
- fn: Recursive function that loops until Verify passes
This isn't just a metaphor: it is a close operational description of how these systems work. Understanding this model transforms how you work with AI coding agents.
The Three Phases
Phase 1: Retrieve
What happens: The AI gathers relevant context for the task.
Retrieval mechanisms:
// Semantic search
const relevantFiles = await semanticSearch({
query: "user authentication patterns",
embeddings: codebaseEmbeddings,
topK: 10
});
// File system traversal
const contextFiles = [
await readFile('CLAUDE.md'),
await readFile('packages/auth/CLAUDE.md'),
await readFile('schemas/user.schema.json')
];
// Grep for patterns
const examples = await grep({
pattern: 'async.*authenticate',
path: 'packages/auth/src/**/*.ts'
});
// AST analysis
const imports = await parseAST('src/auth/service.ts')
.then(ast => extractImports(ast));
Quality of retrieval determines generation accuracy:
High-quality retrieval (relevant context):
→ LLM has correct patterns to follow
→ Generation matches codebase conventions
→ Fewer verification failures
Low-quality retrieval (irrelevant context):
→ LLM hallucinates patterns
→ Generation deviates from conventions
→ More verification failures → more recursion
Phase 2: Generate
What happens: The AI produces code based on retrieved context and probabilistic models.
Generation is probabilistic:
# Simplified model of LLM generation
def generate(context: str, task: str) -> str:
    # LLM samples from a probability distribution:
    # P(next_token | context + task + previous_tokens)
    tokens = []
    for position in range(max_length):
        # Get probability distribution over the vocabulary
        probs = model.forward(context, task, tokens)
        # Sample next token (temperature controls randomness)
        next_token = sample(probs, temperature=0.7)
        tokens.append(next_token)
        if next_token == END_TOKEN:
            break
    return decode(tokens)
Key insight: Generation is sampling from a probability distribution. The distribution is shaped by:
- Context quality (from Retrieve phase)
- Model capabilities (base model knowledge)
- Temperature (randomness vs. determinism)
- Constraints (types, tests, documentation)
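To make the temperature bullet above concrete, here is a toy sketch of how temperature reshapes a next-token distribution (the logits are invented for illustration; real models work over vocabularies of tens of thousands of tokens):

function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled);                 // subtract max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const logits = [2.0, 1.0, 0.5, -1.0];              // four hypothetical candidate tokens
console.log(softmaxWithTemperature(logits, 1.0));  // fairly spread out: more randomness
console.log(softmaxWithTemperature(logits, 0.2));  // sharply peaked: near-deterministic

Lower temperature concentrates probability mass on the highest-scoring token, which is why temperature 0 (greedy decoding) produces the most repeatable outputs.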
Phase 3: Verify
What happens: Quality gates check if generated code is valid.
Verification is deterministic:
# Type checking (deterministic)
tsc --noEmit
# Exit code: 0 = pass, 1 = fail
# Linting (deterministic)
eslint src/**/*.ts
# Exit code: 0 = pass, 1 = fail
# Testing (mostly deterministic)
vitest run
# Exit code: 0 = pass, 1 = fail
# Integration checks
curl http://localhost:3000/health
# Response: 200 = pass, 4xx/5xx = fail
Each gate is a binary filter:
Generated code → [Type Checker] → Pass/Fail
→ [Linter] → Pass/Fail
→ [Tests] → Pass/Fail
→ [CI/CD] → Pass/Fail
All gates must pass for verification to succeed.
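As a sketch, the same gates can be composed programmatically by running each tool and requiring every exit code to be zero (the commands mirror the examples above; this is illustrative, not any particular agent's implementation):

import { spawnSync } from 'node:child_process';

// Each gate is a shell command; exit code 0 means pass.
const gates: Array<[name: string, cmd: string, args: string[]]> = [
  ['types', 'tsc', ['--noEmit']],
  ['lint', 'eslint', ['src']],
  ['tests', 'vitest', ['run']],
];

function verifyAll(): boolean {
  return gates.every(([name, cmd, args]) => {
    const { status } = spawnSync(cmd, args, { stdio: 'inherit' });
    console.log(`${name}: ${status === 0 ? 'PASS' : 'FAIL'}`);
    return status === 0;                 // every() stops at the first failing gate
  });
}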
The Recursive Loop
What happens: If verification fails, the agent loops back to Retrieve or Generate.
def ai_agent(task: str, max_iterations: int = 10) -> Code:
    previous_failures = []
    for iteration in range(max_iterations):
        # Phase 1: Retrieve (augmented with any previous failures)
        context = retrieve(task, previous_failures)
        # Phase 2: Generate
        code = generate(context, task)
        # Phase 3: Verify
        verification_result = verify(code)
        if verification_result.all_passed:
            return code  # Success!
        # Failed verification → add errors to context for next iteration
        previous_failures.append(verification_result.errors)
        # Loop continues...
    raise MaxIterationsExceeded("Could not generate valid code")
Recursive depth depends on:
- Gate strictness: Stricter gates → more iterations needed
- Context quality: Better context → fewer iterations
- Task complexity: Complex tasks → more iterations
- Model capability: Better models → fewer iterations
How Constraints Reduce Entropy
Each constraint you add narrows the probability distribution during generation, reducing entropy.
Without Constraints (High Entropy)
// Prompt: "Write a function to authenticate users"
// High entropy: many equally-probable outputs
// Option 1
function auth(u, p) { return u === 'admin' && p === 'password'; }
// Option 2
async function authenticate(email: string, password: string) {
const user = await db.users.findOne({ email });
return bcrypt.compare(password, user.hash);
}
// Option 3
class Authenticator {
verify(credentials: Credentials): boolean { /* ... */ }
}
// ... 1000+ more possibilities
With Constraints (Low Entropy)
// Types constrain return value and parameters
interface AuthResult {
success: boolean;
user?: User;
error?: string;
}
function authenticate(
email: string,
password: string
): Promise<AuthResult>;
// Tests constrain behavior
describe('authenticate', () => {
it('returns success=true for valid credentials', async () => {
const result = await authenticate('user@example.com', 'correct');
expect(result.success).toBe(true);
expect(result.user).toBeDefined();
});
});
// Context constrains patterns (CLAUDE.md)
// "NEVER throw exceptions in auth functions"
// "ALWAYS return AuthResult"
// Low entropy: only ~10 valid implementations that satisfy all constraints
Mathematics: constraints compound multiplicatively, so each one further shrinks the space of valid outputs (the same calculation is expressed in bits below):
Initial state space: 10,000 possible implementations
After types: 10,000 × 0.4 = 4,000 implementations
After tests: 4,000 × 0.3 = 1,200 implementations
After context: 1,200 × 0.2 = 240 implementations
After linting: 240 × 0.1 = 24 implementations
Final: 24 valid implementations (a 99.76% reduction of the original state space)
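The same filtering can be expressed in bits. Assuming, for illustration, a uniform distribution over the remaining implementations:

$$H_{\text{before}} = \log_2 10{,}000 \approx 13.3 \text{ bits}, \qquad H_{\text{after}} = \log_2 24 \approx 4.6 \text{ bits}$$

$$\Delta H = H_{\text{before}} - H_{\text{after}} \approx 8.7 \text{ bits of entropy removed}$$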
Why Quality Gates Work
Quality gates are entropy filters that eliminate invalid states from the output space.
The Filtering Process
┌─────────────────────────────────────┐
│ All Syntactically Valid Programs │ ← High Entropy
│ (millions) │ H ≈ 20 bits
└──────────────┬──────────────────────┘
│
▼
┌──────────────┐
│ Type Checker │ ← Filter 1
└──────┬───────┘ Reduces to H ≈ 16 bits
│
▼
┌─────────────────────────────────────┐
│ Type-Safe Programs │ ← Medium Entropy
│ (thousands) │
└──────────────┬──────────────────────┘
│
▼
┌──────────────┐
│ Linter │ ← Filter 2
└──────┬───────┘ Reduces to H ≈ 12 bits
│
▼
┌─────────────────────────────────────┐
│ Type-Safe, Clean Programs │ ← Lower Entropy
│ (hundreds) │
└──────────────┬──────────────────────┘
│
▼
┌──────────────┐
│ Tests │ ← Filter 3
└──────┬───────┘ Reduces to H ≈ 5 bits
│
▼
┌─────────────────────────────────────┐
│ Type-Safe, Clean, Correct Programs │ ← Low Entropy
│ (tens) │ H ≈ 5 bits
└─────────────────────────────────────┘
Each gate strips away several bits of entropy, so the valid output space shrinks by orders of magnitude as gates are stacked.
Why This Model Matters
Implication 1: More gates → Lower entropy → More predictable outputs
Implication 2: Better retrieval → Better initial distribution → Fewer iterations
Implication 3: Each verification failure provides information for the next iteration
Practical Applications
Application 1: Designing Optimal Workflows
Understanding the recursive model helps you design better workflows.
Optimize each phase:
// Phase 1: Improve Retrieve
// - Use hierarchical CLAUDE.md files (reduce noise)
// - Implement semantic search (find relevant patterns)
// - Provide working examples (show correct patterns)
// Phase 2: Constrain Generate
// - Add type definitions (narrow output space)
// - Provide context (shape probability distribution)
// - Use temperature=0 for determinism
// Phase 3: Strengthen Verify
// - Add integration tests (catch real errors)
// - Custom linting rules (enforce patterns)
// - Pre-commit hooks (prevent bad commits)
Result: Fewer iterations, faster convergence, higher quality.
Application 2: Debugging LLM Behavior
When the LLM produces unexpected outputs, trace through the model:
Problem: LLM generates code that doesn’t match codebase patterns
Diagnosis:
1. Check Retrieve phase:
- Is relevant context being loaded?
- Are CLAUDE.md files up-to-date?
- Are examples present?
2. Check Generate phase:
- Is the prompt clear?
- Are constraints explicit?
- Is temperature too high (too random)?
3. Check Verify phase:
- Are quality gates running?
- Are they catching the issues?
- Are error messages informative?
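Part of step 1 can be automated. A minimal sketch that flags missing or stale context files (the file list and the 30-day staleness threshold are assumptions for illustration):

import { existsSync, statSync } from 'node:fs';

const expectedContext = ['CLAUDE.md', 'packages/auth/CLAUDE.md', 'schemas/user.schema.json'];
const STALE_AFTER_MS = 30 * 24 * 60 * 60 * 1000;   // 30 days, an arbitrary threshold

for (const path of expectedContext) {
  if (!existsSync(path)) {
    console.warn(`MISSING ${path} (retrieval cannot load it)`);
    continue;
  }
  const ageMs = Date.now() - statSync(path).mtimeMs;
  const days = Math.round(ageMs / 86_400_000);
  console.log(`${ageMs > STALE_AFTER_MS ? 'STALE' : 'FRESH'} ${path} (last modified ${days} days ago)`);
}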
Application 3: Optimizing Context Loading
Insight: Retrieval quality directly affects generation accuracy.
Strategy: Prioritize high-signal context.
// High-signal context (load first)
const criticalContext = [
'CLAUDE.md', // Architectural patterns
'schemas/entity.schema.json', // Type definitions
'examples/working-code.ts', // Concrete examples
'tests/behavior.test.ts' // Expected behavior
];
// Medium-signal context (load if space permits)
const supplementaryContext = [
'README.md', // General overview
'docs/architecture.md', // Design decisions
];
// Low-signal context (skip if context window limited)
const optionalContext = [
'CHANGELOG.md', // Historical changes
'docs/api.md' // Generic API docs
];
Result: Better retrieval → Better generation → Fewer iterations
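To make the prioritization concrete, a budget-based loader might walk the tiers in order and stop when the context budget is exhausted. A minimal sketch (the 4-characters-per-token estimate and the 50K budget are rough assumptions):

import { existsSync, readFileSync } from 'node:fs';

const TOKEN_BUDGET = 50_000;                                          // assumed context budget
const estimateTokens = (text: string) => Math.ceil(text.length / 4);  // crude heuristic

function loadContext(tiers: string[][]): string[] {
  const loaded: string[] = [];
  let used = 0;
  for (const tier of tiers) {                        // tiers ordered high-signal → low-signal
    for (const path of tier) {
      if (!existsSync(path)) continue;
      const text = readFileSync(path, 'utf8');
      const cost = estimateTokens(text);
      if (used + cost > TOKEN_BUDGET) return loaded; // stop once the budget is spent
      loaded.push(text);
      used += cost;
    }
  }
  return loaded;
}

// Usage with the tiers above:
// loadContext([criticalContext, supplementaryContext, optionalContext]);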
Application 4: Measuring Agent Performance
The recursive model provides metrics to track:
interface AgentMetrics {
// Retrieval metrics
contextRelevance: number; // 0-1, how relevant was retrieved context?
contextSize: number; // Tokens loaded
retrievalTime: number; // ms
// Generation metrics
outputSize: number; // Tokens generated
generationTime: number; // ms
temperature: number; // Randomness setting
// Verification metrics
gatesPassed: number; // How many gates passed?
gatesFailed: number; // How many failed?
verificationTime: number; // ms
// Recursive metrics
iterations: number; // How many loops?
convergenceTime: number; // Total time to pass all gates
finalEntropy: number; // Estimated output entropy
}
Track these over time:
Week 1: Average 8 iterations, 45% first-pass success
Week 2: Average 5 iterations, 65% first-pass success (improved context)
Week 3: Average 3 iterations, 80% first-pass success (added quality gates)
Week 4: Average 2 iterations, 90% first-pass success (optimized retrieval)
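A small helper can compute those weekly numbers from a batch of AgentMetrics records. A sketch, assuming "first-pass success" means the run converged in a single iteration:

function summarize(runs: AgentMetrics[]) {
  const avg = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return {
    avgIterations: avg(runs.map((r) => r.iterations)),
    firstPassSuccessRate: runs.filter((r) => r.iterations === 1).length / runs.length,
    avgConvergenceTimeMs: avg(runs.map((r) => r.convergenceTime)),
  };
}

// e.g. summarize(thisWeeksRuns) → { avgIterations: 2.1, firstPassSuccessRate: 0.88, ... }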
Real-World Example: Implementing a Feature
Let’s trace through the recursive function model for a real task.
Task: “Add user email verification to authentication flow”
Iteration 1
Retrieve:
// Context loaded:
- packages/auth/CLAUDE.md (authentication patterns)
- schemas/user.schema.json (User type definition)
- packages/auth/src/authenticate.ts (existing auth code)
- tests/auth/authenticate.test.ts (existing tests)
Generate:
// LLM generates:
async function verifyEmail(token: string): Promise<boolean> {
const user = await db.users.findOne({ verificationToken: token });
if (!user) return false;
user.emailVerified = true;
await user.save();
return true;
}
Verify:
✅ Type check: PASS
❌ Linter: FAIL - "Direct database access not allowed, use repository pattern"
❌ Tests: FAIL - No tests written
Result: Verification failed → Recurse
Iteration 2
Retrieve (augmented with previous errors):
// Previous context +
- packages/auth/src/user-repository.ts (repository pattern)
- Linter error: "Use repository pattern"
- Test failure: "No tests"
Generate:
// LLM generates (now following repository pattern):
async function verifyEmail(
token: string,
userRepo: UserRepository
): Promise<VerifyEmailResult> {
const user = await userRepo.findByVerificationToken(token);
if (!user) {
return { success: false, error: 'Invalid token' };
}
await userRepo.update(user.id, { emailVerified: true });
return { success: true };
}
// Tests added:
describe('verifyEmail', () => {
it('marks email as verified for valid token', async () => {
const result = await verifyEmail('valid-token', mockRepo);
expect(result.success).toBe(true);
});
it('returns error for invalid token', async () => {
const result = await verifyEmail('invalid-token', mockRepo);
expect(result.success).toBe(false);
});
});
Verify:
✅ Type check: PASS
✅ Linter: PASS
❌ Tests: FAIL - "Missing edge case: expired tokens"
Result: Verification failed → Recurse
Iteration 3
Retrieve (augmented again):
// Previous context +
- Test failure: "Missing edge case: expired tokens"
Generate:
// LLM adds edge case handling:
async function verifyEmail(
token: string,
userRepo: UserRepository
): Promise<VerifyEmailResult> {
const user = await userRepo.findByVerificationToken(token);
if (!user) {
return { success: false, error: 'Invalid token' };
}
// Check expiration (added based on test failure)
const tokenAge = Date.now() - user.verificationTokenCreatedAt.getTime();
const maxAge = 24 * 60 * 60 * 1000; // 24 hours
if (tokenAge > maxAge) {
return { success: false, error: 'Token expired' };
}
await userRepo.update(user.id, { emailVerified: true });
return { success: true };
}
// Test added:
it('returns error for expired token', async () => {
const result = await verifyEmail('expired-token', mockRepo);
expect(result.success).toBe(false);
expect(result.error).toContain('expired');
});
Verify:
✅ Type check: PASS
✅ Linter: PASS
✅ Tests: PASS (all 3 tests)
✅ Integration: PASS
Result: All gates passed → Success! Return code.
Total iterations: 3
Convergence time: ~5 minutes
Final quality: High (all gates passed)
The Role of Error Messages
Error messages from verification gates are crucial feedback for the next iteration.
High-Quality Error Messages
// ✅ Good error message (actionable)
"Error: Direct database access not allowed.
Use the repository pattern instead:
import { UserRepository } from './user-repository';
async function example(userRepo: UserRepository) {
const user = await userRepo.findById(userId);
// ...
}
See: packages/auth/CLAUDE.md#repository-pattern"
// LLM can fix this immediately
Low-Quality Error Messages
// ❌ Bad error message (not actionable)
"Error: Architecture violation at line 15"
// LLM doesn't know what to fix → More iterations needed
Principle: Error messages should teach the LLM how to fix the issue.
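One way to apply the principle is to make errors structured so the agent loop can feed them back verbatim into the next Retrieve step. A sketch (the field names are illustrative, not a real linter's output format):

interface ActionableError {
  rule: string;          // e.g. 'no-direct-db-access'
  message: string;       // what went wrong
  fixHint: string;       // how to fix it, ideally with a short code example
  docLink?: string;      // e.g. 'packages/auth/CLAUDE.md#repository-pattern'
}

function formatForPrompt(errors: ActionableError[]): string {
  return errors
    .map((e) => `- [${e.rule}] ${e.message}\n  Fix: ${e.fixHint}${e.docLink ? `\n  See: ${e.docLink}` : ''}`)
    .join('\n');
}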
Integration with Other Patterns
Hierarchical CLAUDE.md Files
Improves the Retrieve phase by organizing context:
Root CLAUDE.md → Global patterns
Domain CLAUDE.md → Domain-specific patterns
Feature CLAUDE.md → Feature-specific patterns
Result: More relevant context → Better generation
Quality Gates as Information Filters
Strengthens the Verify phase:
More gates → More filters → Lower entropy → Higher quality
Prompt Caching
Optimizes the Retrieve phase:
Cached context → Faster retrieval → Lower cost
Test-Based Regression Patching
Enhances the Verify phase:
Each test → Permanent entropy reduction → Fewer regressions
Limitations of the Model
Limitation 1: Bounded Rationality
LLMs have finite context windows. Retrieval must fit within limits:
Claude Sonnet 4: 200K tokens
If context > 200K:
→ Truncation occurs
→ Important context may be lost
→ Generation quality degrades
Solution: Prioritize high-signal context, use hierarchical organization.
Limitation 2: Non-Deterministic Generation
Generation is probabilistic, not deterministic:
# Same input can produce different outputs
for i in range(5):
output = generate(context, task, temperature=0.7)
# output_1 ≠ output_2 ≠ output_3 ...
Solution: Use temperature=0 for maximum determinism, or accept slight variations.
Limitation 3: Verification is Binary
Gates return pass/fail, not quality scores:
Tests pass → ✅ (but code might be suboptimal)
Tests fail → ❌ (but code might be 99% correct)
Solution: Add more granular gates (performance tests, code coverage).
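One hedged way to add granularity is to wrap a continuous measurement (coverage, latency, complexity) in a threshold check, keeping the binary pass/fail the loop needs while also recording how close a failure was. A sketch:

interface GradedGateResult {
  name: string;
  score: number;       // continuous quality signal, e.g. line coverage as a fraction
  threshold: number;   // policy choice
  passed: boolean;
}

function coverageGate(lineCoverage: number, threshold = 0.8): GradedGateResult {
  // lineCoverage would come from your coverage tool's report (format varies by tool).
  return { name: 'coverage', score: lineCoverage, threshold, passed: lineCoverage >= threshold };
}

The passed flag still drives the recursion; the score is extra feedback for the next iteration.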
Limitation 4: Local Optima
Recursion may converge to local optima instead of global:
Iteration 1: Solution A (70% optimal, fails test)
Iteration 2: Solution B (60% optimal, passes test) → Converges
Better solution C (95% optimal) never explored
Solution: Use multiple generation attempts, compare alternatives.
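A sketch of that mitigation, written against the abstract Retrieve/Generate/Verify functions from the formula (qualityScore is a hypothetical scoring helper, e.g. coverage or lint-warning count):

function bestOfN(n = 3): Code | null {
  const context = Retrieve();                      // same context for every candidate
  let best: { code: Code; score: number } | null = null;
  for (let i = 0; i < n; i++) {
    const code = Generate(context);                // independent samples (non-zero temperature)
    const result = Verify(code);
    if (!result.passed) continue;
    const score = qualityScore(code);              // hypothetical: rank the passing candidates
    if (best === null || score > best.score) best = { code, score };
  }
  return best ? best.code : null;                  // null → fall back to the recursive loop
}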
Best Practices
1. Optimize Each Phase
Retrieve: Load only relevant, high-signal context
Generate: Provide clear constraints and examples
Verify: Use comprehensive, fast quality gates
2. Minimize Iterations
Goal: First-pass success rate >80%
Strategies:
- Improve context relevance (better retrieval)
- Add type constraints (narrow output space)
- Provide working examples (show correct patterns)
- Write informative error messages (teach LLM)
3. Track Convergence Metrics
const metrics = {
avgIterations: 2.3, // Target: <3
firstPassSuccessRate: 0.85, // Target: >0.8
avgConvergenceTime: 45_000, // Target: <60s
};
4. Design Gates for Fast Feedback
Fast gates (run first):
- Type checking (1-2s)
- Linting (1-2s)
- Unit tests (5-10s)
Slow gates (run last):
- Integration tests (30-60s)
- E2E tests (2-5min)
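The ordering above can be encoded so the cheapest gates always run first and a failure short-circuits before the expensive ones. A sketch (the cost estimates are illustrative):

interface Gate {
  name: string;
  estimatedMs: number;          // rough cost estimate, used only for ordering
  run: () => boolean;           // true = pass
}

function runGates(gates: Gate[]): { passed: boolean; failedAt?: string } {
  const ordered = [...gates].sort((a, b) => a.estimatedMs - b.estimatedMs); // cheapest first
  for (const gate of ordered) {
    if (!gate.run()) return { passed: false, failedAt: gate.name };         // fail fast
  }
  return { passed: true };
}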
5. Provide Actionable Feedback
Every error message should teach the LLM how to fix:
❌ "Error: Invalid"
✅ "Error: Expected Promise<AuthResult>, got Promise<boolean>.
Update return type to match AuthResult interface."
Measuring Success
Key Metrics
Convergence rate: Percentage of tasks that converge to valid solution
Target: >95%
Average iterations: How many loops until all gates pass
Target: <3 iterations
First-pass success: Percentage of generations that pass all gates on first try
Target: >80%
Entropy reduction: How much constraints narrow output space
Target: >99% reduction
Success Indicators
High-performing system:
- 90%+ first-pass success rate
- <2 average iterations
- <30s convergence time
- Consistent outputs across runs
Low-performing system:
- <50% first-pass success rate
- >5 average iterations
- >2min convergence time
- Inconsistent outputs across runs
Conclusion
The recursive function model provides a mathematical framework for understanding AI coding agents:
AI_Agent = fn(Verify(Generate(Retrieve())))
Key insights:
- Retrieval quality determines generation accuracy → Optimize context loading
- Quality gates are entropy filters → Add comprehensive gates
- Each iteration provides learning → Write informative error messages
- Convergence is bounded → Track and optimize iteration count
- The model is universal → Applies to all AI coding agents
Understanding this model helps you:
- Design better workflows (optimize each phase)
- Debug LLM behavior (trace through the model)
- Measure performance (track convergence metrics)
- Build better tools (enhance retrieve/generate/verify)
The result: AI coding agents that converge faster, produce higher quality code, and require less human intervention—not by chance, but by design.
Related Concepts
- Quality Gates as Information Filters: How verification gates reduce entropy
- Entropy in Code Generation: Mathematical framework for understanding uncertainty
- Hierarchical Context Patterns: Optimizing the retrieval phase
- Test-Based Regression Patching: Building verification gates incrementally
- Prompt Caching Strategy: Optimizing retrieval performance and cost
Mathematical Foundation
$$\text{AI\_Agent} = f(\text{Verify}(\text{Generate}(\text{Retrieve}())))$$
Understanding the Recursive Function Model
The formula AI_Agent = f(Verify(Generate(Retrieve()))) describes how AI coding agents operate; f here is the same recursive wrapper written fn in the sections above.
Let’s break it down from the inside out:
Retrieve() – Context Gathering
Retrieve is a function that takes no visible arguments (though internally it knows the task) and returns context.
function Retrieve(): Context {
// Returns:
// - CLAUDE.md files
// - Schema definitions
// - Example code
// - Test files
// - Relevant patterns
return context;
}
Output: A set of relevant information (context) for the task.
Example: For task “implement authentication”, Retrieve might return:
- packages/auth/CLAUDE.md (patterns)
- schemas/user.schema.json (types)
- tests/auth.test.ts (expected behavior)
Generate(context) – Code Production
Generate is a function that takes context as input and returns generated code.
function Generate(context: Context): Code {
// LLM samples from probability distribution
// P(code | context)
// Produces code based on patterns in context
return generatedCode;
}
Input: Context from Retrieve()
Output: Generated code
Example: Given auth patterns in context, Generate produces:
async function authenticate(email: string, password: string) {
// ... implementation
}
Verify(code) – Quality Gates
Verify is a function that takes code as input and returns pass/fail status.
function Verify(code: Code): VerificationResult {
// Run quality gates:
const typeCheck = runTypeChecker(code); // Pass/Fail
const lintCheck = runLinter(code); // Pass/Fail
const testCheck = runTests(code); // Pass/Fail
return {
passed: typeCheck && lintCheck && testCheck,
errors: [...]
};
}
Input: Generated code
Output: Verification result (pass/fail + errors)
Example: Code runs through:
- Type checker: ✅ PASS
- Linter: ❌ FAIL (missing return type)
- Tests: ❌ FAIL (missing edge case)
Result: { passed: false, errors: [...] }
f() – Recursive Function
f is the outer function that orchestrates the loop.
function f(maxIterations: number = 10): Code {
  let iteration = 0;
  let previousErrors = [];
  while (iteration < maxIterations) {
    // Retrieve context (augmented with previous errors;
    // Retrieve knows the task internally, as described above)
    const context = Retrieve(previousErrors);
    // Generate code
    const code = Generate(context);
    // Verify code
    const result = Verify(code);
    if (result.passed) {
      return code; // Success!
    }
    // Failed → add errors for next iteration
    previousErrors.push(...result.errors);
    iteration++;
  }
  throw new Error('Max iterations exceeded');
}
Purpose: Keep looping until Verify passes (or max iterations reached)
Putting It All Together
The full expression AI_Agent = f(Verify(Generate(Retrieve()))) means:
// Step 1: Retrieve context
const context = Retrieve(); // Load CLAUDE.md, schemas, examples
// Step 2: Generate code using context
const code = Generate(context); // LLM produces code
// Step 3: Verify code through gates
const result = Verify(code); // Type check, lint, test
// Step 4: Recursive loop
if (result.passed) {
return code; // Done!
} else {
// Loop back to step 1 with error feedback
const context = Retrieve(result.errors); // Augmented context
const code = Generate(context); // Try again
const result = Verify(code); // Verify again
// ... continues until pass or max iterations
}
Reading the Composition
The nested function calls show the order of operations:
f(Verify(Generate(Retrieve())))
From inside out:
- Retrieve() runs first → produces context
- Generate(context) runs second → produces code
- Verify(code) runs third → produces result
- f(result) runs last → loops if needed
From outside in (how the call chain unwinds):
- f starts the loop
- f calls Verify
- Verify needs code, so calls Generate
- Generate needs context, so calls Retrieve
- Retrieve provides context
- Generate produces code
- Verify checks code
- f decides: return code or loop
Why This Model Matters
Insight 1: It’s a pipeline with feedback:
Retrieve → Generate → Verify
   ↑                    │
   └──────── loop ──────┘
Insight 2: Each function has a specific role:
- Retrieve: Gather information
- Generate: Create code
- Verify: Check quality
- f: Orchestrate loop
Insight 3: Optimization points are clear:
- Better Retrieve → Better context → Better generation
- More constraints in Generate → Lower entropy → Better code
- Stricter Verify → Higher quality → Fewer bugs
- Smarter f → Fewer iterations → Faster convergence
Concrete Example
Task: “Add email validation to signup”
// Iteration 1
const context1 = Retrieve();
// Returns: user schema, validation patterns
const code1 = Generate(context1);
// Produces: basic email check with regex
const result1 = Verify(code1);
// Tests fail: doesn't handle unicode emails
// Iteration 2 (f loops)
const context2 = Retrieve(result1.errors);
// Returns: previous context + test failure details
const code2 = Generate(context2);
// Produces: improved validation with unicode support
const result2 = Verify(code2);
// All tests pass!
return code2; // Success after 2 iterations
Total formula execution:
AI_Agent = f(Verify(Generate(Retrieve())))
= f(Verify(Generate({schemas, patterns})))
= f(Verify({basic email check}))
= f({failed: test error})
= [loop]
= f(Verify(Generate({schemas, patterns, test error})))
= f(Verify({improved email check}))
= f({passed: true})
= {improved email check} ← Final output
Key Takeaway
The formula is recursive (function calls itself) and compositional (functions nest inside each other).
This precisely describes how all AI coding agents work, regardless of implementation details.
Related Concepts
- Information Theory for Coding Agents – Mathematical foundations (entropy, mutual information, channel capacity) underlying the recursive model
- Entropy in Code Generation – How constraints reduce entropy in the Generate phase
- Invariants in Programming and LLM Generation – How invariants constrain the Verify phase
- Quality Gates as Information Filters – How verification gates reduce state space
- Making Invalid States Impossible – Sculpting the computation graph to prevent invalid states
- Hierarchical Context Patterns – Optimizing the Retrieve phase
- Test-Based Regression Patching – Building verification gates incrementally
- Prompt Caching Strategy – Optimizing retrieval performance and cost
- Trust But Verify Protocol – Implementing verification through AI-generated tests