Summary
Reviewing all AI-generated code manually is time-consuming and error-prone. Instead of reviewing 1000+ lines of generated code, ask the AI to write verification tests and review just the test output. This reduces review burden by 99%, catches bugs at generation time, and creates compound learning where each verification teaches the AI what ‘correct’ looks like.
The Problem
Reviewing all AI-generated code is time-consuming and error-prone. For a 1000-line feature, manually checking every function, edge case, and integration point takes hours, and bugs still slip through. Traditional code review assumes human-written code backed by human reasoning – AI-generated code lacks that context, making review even harder.
The Solution
Don’t trust AI output – ask AI to create verification instead. The pattern: AI writes code → AI writes verification (tests, scripts, visual checks) → you review verification output. This shifts focus from reviewing implementation details to validating behavior. Instead of reading 1000 lines of code, you check 10 lines of test output. Bugs are caught immediately while context is fresh, and verification artifacts compound into a quality-gate system.
The Problem
When working with AI coding agents, you face a fundamental challenge: how do you verify generated code is correct?
The naive approach is manual code review:
AI: "I've implemented user authentication with password hashing,
session management, and rate limiting. Here are 847 lines of code."
You: *Starts reading line by line*
- Is the password hash secure?
- Are sessions properly invalidated?
- Is rate limiting configured correctly?
- Are edge cases handled?
- Is error handling complete?
- Are there race conditions?
*3 hours later, eyes glazing over*
Why Manual Review Fails
1. Scale Problem
AI can generate code 10-100x faster than humans can review it:
- AI generation: 1000 lines in 2 minutes
- Human review: 1000 lines in 2-4 hours
- Result: Review becomes the bottleneck
2. Context Loss
By the time you finish reviewing, you’ve forgotten earlier parts:
Line 1-200: "This authentication logic looks good..."
Line 400-600: "Wait, how does this relate to the session management?"
Line 800-1000: "I need to re-read the beginning to understand this..."
3. False Confidence
Code that looks correct often isn’t:
// Looks good at first glance...
async function createUser(email: string, password: string) {
const hash = await bcrypt.hash(password, 10);
const user = await db.users.create({ email, passwordHash: hash });
return user;
}
// But missing:
// - Email validation
// - Duplicate email check
// - Password strength requirements
// - Input sanitization
// - Error handling
// - Transaction rollback
You think you’ve reviewed it thoroughly, but missed 6 critical issues.
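For contrast, a version that closes those gaps might look like the sketch below. The helpers isValidEmail, isStrongPassword, and the db.transaction API are assumptions for illustration, not part of the original snippet:
// Sketch only – same bcrypt/db context as above; isValidEmail,
// isStrongPassword, and db.transaction are assumed helpers
async function createUserSafely(email: string, password: string) {
  if (!isValidEmail(email)) throw new Error('Invalid email');            // email validation
  if (!isStrongPassword(password)) throw new Error('Password too weak'); // strength requirements
  return db.transaction(async (tx) => {                                  // rolls back on any failure
    if (await tx.users.findByEmail(email)) {
      throw new Error('Email already registered');                       // duplicate check
    }
    const hash = await bcrypt.hash(password, 10);
    return tx.users.create({ email: email.trim().toLowerCase(), passwordHash: hash }); // basic input normalization
  });
}
None of those six gaps are visible from a quick read of the original five lines – which is exactly why visual review breeds false confidence.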
4. Missed Edge Cases
Humans are bad at systematically checking edge cases:
// Did you verify:
// - Empty string inputs?
// - Null/undefined values?
// - Maximum length strings?
// - Special characters?
// - Unicode edge cases?
// - Concurrent requests?
// - Database connection failures?
// - Network timeouts?
Probably not. Too tedious.
5. No Regression Protection
Even if you catch all bugs during review, there’s no artifact preventing regression:
Today: You manually verify authentication works
1 week later: AI modifies authentication code
Result: Previous bugs can re-emerge, no automated check
The Cost
Time cost:
- 1000-line feature = 3 hours of review
- 5 features/week = 15 hours/week reviewing
- 37% of your time spent reading code
Quality cost:
- Bugs slip through review (human error)
- No systematic edge case coverage
- No regression protection
- False confidence in “reviewed” code
Productivity cost:
- Review becomes bottleneck
- AI sits idle waiting for approval
- Iteration slows down
- Development velocity tanks
The Solution
Don’t trust AI output – ask AI to create verification instead.
The Trust But Verify Pattern
Instead of:
1. AI writes code
2. You review everything
3. Bugs slip through
Do this:
1. AI writes code
2. AI writes verification (tests, scripts, visual checks)
3. AI runs verification
4. You review verification output (10 lines vs 1000 lines)
5. Fix any failures immediately while context is fresh
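Concretely, the artifact you review in step 4 can be as small as an array of pass/fail records – a minimal sketch, with the field names assumed:
interface VerificationResult {
  test: string;               // human-readable test name
  status: 'PASS' | 'FAIL';
  reason?: string;            // populated only on failure
}
// In step 4 you scan a list of these records, not the implementation diff.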
Why This Works
1. Verification is Easier to Review
Compare these review tasks:
Manual review:
// Review 847 lines of authentication code
// Mentally execute all edge cases
// Try to spot security vulnerabilities
// Guess at race conditions
// Wonder about error handling
*3 hours of intense concentration*
Verification review:
# Review test output
✅ User registration with valid data: PASSED
✅ Duplicate email rejection: PASSED
✅ Password strength validation: PASSED
✅ SQL injection prevention: PASSED
✅ Rate limiting (100 requests): PASSED
✅ Session expiration: PASSED
✅ Concurrent registration (race condition): PASSED
❌ Password reset token expiration: FAILED
Expected: Token expires after 1 hour
Actual: Token never expires
*30 seconds to spot the issue*
2. Verification is Systematic
Tests check every edge case, every time:
// AI generates comprehensive test suite
describe('User Authentication', () => {
it('accepts valid email formats', () => { ... });
it('rejects invalid email formats', () => { ... });
it('requires password >= 8 characters', () => { ... });
it('requires password with uppercase', () => { ... });
it('requires password with number', () => { ... });
it('requires password with special char', () => { ... });
it('prevents SQL injection in email', () => { ... });
it('prevents SQL injection in password', () => { ... });
it('rate limits registration attempts', () => { ... });
it('handles database connection errors', () => { ... });
// ... 50+ more tests
});
Human review would skip most of these. Tests check them all, every time.
3. Verification Creates Artifacts
Tests become permanent quality gates:
Day 1: AI writes auth code + tests
Tests pass ✅
Day 7: AI modifies auth code
Tests catch regression ❌
AI fixes issue
Tests pass ✅
Day 30: AI refactors auth code
Tests ensure behavior unchanged ✅
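A minimal sketch of what such a gate looks like, assuming a Vitest setup and hypothetical createResetToken / isTokenValid helpers: once the Day 7 regression is caught and fixed, the failing case stays in the suite permanently.
// password-reset.regression.test.ts – hypothetical names, sketch only
import { describe, it, expect } from 'vitest';
import { createResetToken, isTokenValid } from './password-reset';
describe('regression: reset tokens must expire', () => {
  it('rejects a token older than one hour', () => {
    const issuedAt = Date.now() - 2 * 60 * 60 * 1000; // issued two hours ago
    const token = createResetToken({ issuedAt });
    expect(isTokenValid(token)).toBe(false);
  });
});
Any later change that reintroduces non-expiring tokens now fails loudly instead of shipping silently.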
4. Compound Learning
Each verification teaches the AI what “correct” looks like:
Iteration 1:
Code: Missing rate limiting
Verification: Test fails
Learning: "Rate limiting is required"
Iteration 2:
Code: Includes rate limiting
Verification: Test passes
Learning: "This is the correct pattern"
Iteration 3+:
Code: Automatically includes rate limiting
Verification: Confirms it's correct
The AI gets better with each verification cycle.
Implementation
Essential Verification Patterns
Pattern 1: Runtime Verification
Use case: Verify code actually works as intended
Example prompt:
"Implement user authentication API endpoint.
After implementation, create a verification script that:
1. Starts the server
2. Tests all endpoints with valid/invalid data
3. Checks response codes and data
4. Verifies database state
5. Cleans up test data
6. Reports all results
Run the verification script and show me the output."
AI generates:
// verify-auth.ts
import { startServer, stopServer } from './test-server';
import { resetDatabase } from './test-utils';
async function verifyAuthentication() {
console.log('🔍 Starting authentication verification...\n');
await startServer();
await resetDatabase();
const results = [];
// Test 1: User registration
try {
const res = await fetch('http://localhost:3000/api/auth/register', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
email: '[email protected]',
password: 'SecurePass123!'
})
});
if (res.status === 201) {
const user = await res.json();
if (user.id && user.email === '[email protected]') {
results.push({ test: 'User registration', status: 'PASS' });
} else {
results.push({ test: 'User registration', status: 'FAIL', reason: 'Invalid response data' });
}
} else {
results.push({ test: 'User registration', status: 'FAIL', reason: `Expected 201, got ${res.status}` });
}
} catch (error) {
results.push({ test: 'User registration', status: 'FAIL', reason: error.message });
}
// Test 2: Duplicate email rejection
try {
const res = await fetch('http://localhost:3000/api/auth/register', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
email: '[email protected]',
password: 'SecurePass123!'
})
});
if (res.status === 409) {
results.push({ test: 'Duplicate email rejection', status: 'PASS' });
} else {
results.push({ test: 'Duplicate email rejection', status: 'FAIL', reason: `Expected 409, got ${res.status}` });
}
} catch (error) {
results.push({ test: 'Duplicate email rejection', status: 'FAIL', reason: error.message });
}
// Test 3: Password strength validation
// Test 4: SQL injection prevention
// Test 5: Rate limiting
// ... more tests
await stopServer();
// Print results
console.log('\n📊 Verification Results:\n');
results.forEach(({ test, status, reason }) => {
const icon = status === 'PASS' ? '✅' : '❌';
console.log(`${icon} ${test}: ${status}`);
if (reason) console.log(` Reason: ${reason}`);
});
const passCount = results.filter(r => r.status === 'PASS').length;
const failCount = results.filter(r => r.status === 'FAIL').length;
console.log(`\n${passCount}/${results.length} tests passed`);
if (failCount > 0) {
process.exit(1);
}
}
verifyAuthentication();
Your review: Scan output for failures, not code
Pattern 2: Visual Verification
Use case: Verify UI components render correctly
Example prompt:
"Implement a user profile page with:
- Avatar upload
- Name and bio editing
- Email change with verification
- Password reset
After implementation, create a Playwright script that:
1. Takes screenshots of all UI states (empty, filled, error states)
2. Tests all interactions (upload, edit, save, cancel)
3. Verifies form validation
4. Saves screenshots to ./screenshots/profile/
Run the script and show me the screenshots."
AI generates:
// verify-profile-ui.ts
import { chromium } from 'playwright';
import { startServer, stopServer } from './test-server';
import fs from 'fs';
async function verifyProfileUI() {
await startServer();
const browser = await chromium.launch();
const page = await browser.newPage();
// Ensure screenshot directory exists
fs.mkdirSync('./screenshots/profile', { recursive: true });
// Test 1: Empty state
await page.goto('http://localhost:3000/profile');
await page.screenshot({ path: './screenshots/profile/01-empty-state.png' });
// Test 2: Filled state
await page.fill('[name="name"]', 'John Doe');
await page.fill('[name="bio"]', 'Software engineer passionate about AI');
await page.screenshot({ path: './screenshots/profile/02-filled-state.png' });
// Test 3: Validation errors
await page.fill('[name="email"]', 'invalid-email');
await page.click('button[type="submit"]');
await page.screenshot({ path: './screenshots/profile/03-validation-errors.png' });
// Test 4: Avatar upload
await page.setInputFiles('[type="file"]', './test-fixtures/avatar.png');
await page.screenshot({ path: './screenshots/profile/04-avatar-uploaded.png' });
// Test 5: Success state
await page.fill('[name="email"]', '[email protected]');
await page.click('button[type="submit"]');
await page.waitForSelector('.success-message');
await page.screenshot({ path: './screenshots/profile/05-success-state.png' });
await browser.close();
await stopServer();
console.log('✅ Screenshots saved to ./screenshots/profile/');
console.log('📸 Review the following states:');
console.log(' 1. Empty state');
console.log(' 2. Filled state');
console.log(' 3. Validation errors');
console.log(' 4. Avatar uploaded');
console.log(' 5. Success state');
}
verifyProfileUI();
Your review: Look at 5 screenshots instead of reading 500 lines of React code
Pattern 3: Data Verification
Use case: Verify data migrations, transformations, or bulk operations
Example prompt:
"Create a script that migrates user data from old schema to new schema.
After implementation, create a verification script that:
1. Counts records before migration
2. Runs migration
3. Counts records after migration
4. Validates data integrity (no data loss)
5. Checks for duplicate records
6. Verifies all foreign keys are valid
7. Generates a detailed report
Run the verification and show me the report."
AI generates:
// verify-migration.ts
import { db } from './database';
import { runMigration } from './migrate'; // assumed module exporting the migration entry point
async function verifyMigration() {
console.log('🔍 Verifying data migration...\n');
// Pre-migration counts
const beforeCounts = {
users: await db.old_users.count(),
profiles: await db.old_profiles.count(),
posts: await db.old_posts.count(),
};
console.log('📊 Before migration:');
console.log(` Users: ${beforeCounts.users}`);
console.log(` Profiles: ${beforeCounts.profiles}`);
console.log(` Posts: ${beforeCounts.posts}\n`);
// Run migration
console.log('🚀 Running migration...');
await runMigration();
console.log('✅ Migration complete\n');
// Post-migration counts
const afterCounts = {
users: await db.users.count(),
profiles: await db.profiles.count(),
posts: await db.posts.count(),
};
console.log('📊 After migration:');
console.log(` Users: ${afterCounts.users}`);
console.log(` Profiles: ${afterCounts.profiles}`);
console.log(` Posts: ${afterCounts.posts}\n`);
// Verification checks
const checks = [];
// Check 1: No data loss
checks.push({
name: 'No user data loss',
passed: beforeCounts.users === afterCounts.users,
expected: beforeCounts.users,
actual: afterCounts.users,
});
// Check 2: No duplicates
const duplicateEmails = await db.users
.groupBy('email')
.having('COUNT(*) > 1')
.select('email', 'COUNT(*)');
checks.push({
name: 'No duplicate emails',
passed: duplicateEmails.length === 0,
expected: 0,
actual: duplicateEmails.length,
});
// Check 3: Foreign key integrity
const orphanedProfiles = await db.profiles
.leftJoin('users', 'profiles.user_id', 'users.id')
.whereNull('users.id')
.count();
checks.push({
name: 'All profiles have valid user_id',
passed: orphanedProfiles === 0,
expected: 0,
actual: orphanedProfiles,
});
// Print results
console.log('🔍 Verification Checks:\n');
checks.forEach(check => {
const icon = check.passed ? '✅' : '❌';
console.log(`${icon} ${check.name}`);
if (!check.passed) {
console.log(` Expected: ${check.expected}`);
console.log(` Actual: ${check.actual}`);
}
});
const allPassed = checks.every(c => c.passed);
console.log(`\n${allPassed ? '✅ All checks passed' : '❌ Some checks failed'}`);
if (!allPassed) process.exit(1);
}
verifyMigration();
Your review: Read the summary report, not database queries
Pattern 4: API Verification
Use case: Verify API endpoints work correctly
Example prompt:
"Implement a REST API for task management with CRUD operations.
After implementation, create a comprehensive API test suite that:
1. Tests all endpoints (GET, POST, PUT, DELETE)
2. Tests with valid and invalid data
3. Tests authentication and authorization
4. Tests rate limiting
5. Tests error handling
6. Generates a test coverage report
Run the tests and show me the results."
AI generates:
// verify-tasks-api.test.ts
import { describe, it, expect, beforeAll, afterAll, beforeEach } from 'vitest';
import { startServer, stopServer } from './test-server';
import { resetDatabase, createTestUser, createTask } from './test-utils'; // createTask assumed to live with the other helpers
describe('Tasks API Verification', () => {
let authToken: string;
let userId: string;
beforeAll(async () => {
await startServer();
});
afterAll(async () => {
await stopServer();
});
beforeEach(async () => {
await resetDatabase();
const { token, id } = await createTestUser();
authToken = token;
userId = id;
});
describe('POST /api/tasks', () => {
it('creates task with valid data', async () => {
const res = await fetch('http://localhost:3000/api/tasks', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${authToken}`,
},
body: JSON.stringify({
title: 'Test task',
description: 'Task description',
}),
});
expect(res.status).toBe(201);
const task = await res.json();
expect(task).toMatchObject({
title: 'Test task',
description: 'Task description',
userId,
});
});
it('rejects missing title', async () => {
const res = await fetch('http://localhost:3000/api/tasks', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${authToken}`,
},
body: JSON.stringify({
description: 'Task description',
}),
});
expect(res.status).toBe(400);
});
it('requires authentication', async () => {
const res = await fetch('http://localhost:3000/api/tasks', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ title: 'Test' }),
});
expect(res.status).toBe(401);
});
});
describe('GET /api/tasks', () => {
it('returns user tasks only', async () => {
// Create tasks for this user
await createTask(authToken, { title: 'Task 1' });
await createTask(authToken, { title: 'Task 2' });
// Create task for different user
const { token: otherToken } = await createTestUser();
await createTask(otherToken, { title: 'Other task' });
const res = await fetch('http://localhost:3000/api/tasks', {
headers: { 'Authorization': `Bearer ${authToken}` },
});
expect(res.status).toBe(200);
const tasks = await res.json();
expect(tasks).toHaveLength(2);
expect(tasks.every(t => t.userId === userId)).toBe(true);
});
});
// More tests for PUT, DELETE, etc...
});
Your review: Check test output, not API implementation code
Step-by-Step Workflow
Step 1: Request Implementation + Verification
Instead of:
"Implement user authentication"
Ask for:
"Implement user authentication.
After implementation, create a comprehensive verification suite that tests:
1. User registration (valid/invalid data)
2. Login (correct/incorrect credentials)
3. Password hashing (never stored plain text)
4. Session management (creation, validation, expiration)
5. Rate limiting (prevent brute force)
6. Security (SQL injection, XSS prevention)
Run the verification suite and show me the results."
Step 2: Review Verification Output
AI runs tests and shows:
✅ User registration with valid email: PASSED
✅ User registration rejects invalid email: PASSED
✅ User registration requires password >= 8 chars: PASSED
❌ Duplicate email handling: FAILED
Expected: 409 Conflict
Actual: 500 Internal Server Error
✅ Login with correct credentials: PASSED
❌ Login rate limiting after 5 attempts: FAILED
Expected: 429 Too Many Requests after 5 attempts
Actual: No rate limiting detected
✅ Password hashing verification: PASSED
✅ Session expiration after 24h: PASSED
✅ SQL injection prevention in email field: PASSED
7/9 tests passed, 2 failed
Your action: Scan for failures (takes 10 seconds)
Step 3: Request Fixes
"Fix the 3 failing tests:
1. Duplicate email should return 409, not 500
2. Implement rate limiting (5 attempts per 15 minutes)
3. Re-run verification after fixes"
Step 4: Verify Fixes
AI shows:
✅ All 9 tests passed
✅ User registration with valid email: PASSED
✅ User registration rejects invalid email: PASSED
✅ User registration requires password >= 8 chars: PASSED
✅ Duplicate email handling: PASSED (FIXED)
✅ Login with correct credentials: PASSED
✅ Login rate limiting after 5 attempts: PASSED (FIXED)
✅ Password hashing verification: PASSED
✅ Session expiration after 24h: PASSED
✅ SQL injection prevention in email field: PASSED
Your action: Confirm all tests pass (5 seconds)
Total review time: 15 seconds instead of 3 hours
Benefits
1. Reduced Review Burden
Before:
- Review 1000 lines of code
- Mentally execute edge cases
- Try to spot bugs visually
- Time: 2-4 hours
After:
- Scan test output (10 lines)
- See which tests pass/fail
- Focus on failures only
- Time: 30 seconds
Reduction: 99% less time
2. Higher Quality
Before:
- Human review misses edge cases
- No systematic coverage
- Bugs slip through
- Bug detection: 40-60%
After:
- Automated tests check every case
- Systematic coverage
- Bugs caught immediately
- Bug detection: 80-95%
Improvement: 2x better bug detection
3. Faster Iteration
Before:
Generate code (5 min) → Wait for review (hours/days) → Fix issues → Wait again
Cycle time: Days
After:
Generate code (5 min) → Generate verification (2 min) → Review output (30 sec) → Fix (5 min)
Cycle time: 15 minutes
Improvement: 100x faster iteration
4. Compound Learning
Verification creates a feedback loop:
Iteration 1: Generate code → Tests fail → Fix → Tests pass
Learning: "This is what correct looks like"
Iteration 2: Generate code → Tests pass first time
Learning: "I remember the correct pattern"
Iteration 3+: Generate increasingly correct code on first attempt
Learning compounds over time
Result: AI gets better with each verification cycle
5. Regression Protection
Tests become permanent quality gates:
Day 1: Feature + tests created
Day 7: Refactoring doesn't break tests ✅
Day 30: New feature doesn't break existing tests ✅
Day 90: Still protected by original tests ✅
Best Practices
1. Always Request Verification
Make it a habit:
❌ "Implement feature X"
✅ "Implement feature X.
After implementation, create verification that tests Y and Z.
Run verification and show results."
2. Specify Verification Criteria
Be explicit about what to verify:
"Create a password reset flow.
Verification must test:
- ✅ Email validation
- ✅ Token generation and expiration (1 hour)
- ✅ Token can only be used once
- ✅ Password strength requirements
- ✅ Old password is actually changed
- ✅ User can login with new password
- ✅ Old password no longer works
- ✅ Rate limiting on reset requests
- ✅ Email sending (mock or real)
Run verification and show results."
3. Request Multiple Verification Types
Combine different verification patterns:
"Implement checkout flow.
Create verification suite with:
1. Integration tests (API endpoints)
2. Playwright tests (UI flow)
3. Data verification (order created in database)
4. Email verification (confirmation sent)
Run all verifications and show results."
4. Use Verification Output as Documentation
Test output documents behavior:
# This output IS the documentation
✅ Cart - Add item increases quantity
✅ Cart - Remove item decreases quantity
✅ Cart - Empty cart shows empty state
✅ Checkout - Validates credit card format
✅ Checkout - Calculates tax based on shipping address
✅ Checkout - Sends confirmation email
✅ Checkout - Creates order in database
✅ Checkout - Clears cart after successful order
Anyone reading this understands what the system does.
5. Keep Verification Scripts
Don’t throw away verification scripts:
project/
├── src/
│ └── features/
│ ├── auth/
│ │ ├── auth.service.ts
│ │ └── verify-auth.ts ← Keep this
│ ├── checkout/
│ │ ├── checkout.service.ts
│ │ └── verify-checkout.ts ← Keep this
│ └── user/
│ ├── user.service.ts
│ └── verify-user.ts ← Keep this
Run them in CI/CD for continuous verification.
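One way to wire this up is a small runner that finds and executes every verify-*.ts script, failing the build if any exits non-zero. A sketch, assuming Node 20+ (for recursive readdirSync) and tsx available via npx:
// run-verifications.ts – sketch only
import { readdirSync } from 'node:fs';
import { join } from 'node:path';
import { spawnSync } from 'node:child_process';
const files = readdirSync('src/features', { recursive: true })
  .map(String)
  .filter((f) => /verify-.*\.ts$/.test(f));
let failed = 0;
for (const file of files) {
  const path = join('src/features', file);
  console.log(`▶ ${path}`);
  const result = spawnSync('npx', ['tsx', path], { stdio: 'inherit' });
  if (result.status !== 0) failed++; // non-zero exit means a failed verification
}
console.log(`${files.length - failed}/${files.length} verification scripts passed`);
if (failed > 0) process.exit(1); // fail the CI job
Hooked into a CI step, the scripts become the same kind of permanent quality gate as the test suite.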
6. Review Verification Code Too
Occasionally review the verification code itself:
"Show me the verification code for the authentication tests.
I want to ensure:
1. All edge cases are covered
2. Tests are actually testing the right things
3. No false positives"
But this is much faster than reviewing implementation code.
Common Pitfalls
❌ Pitfall 1: Trusting Verification Without Running It
Problem: Assuming verification works without actually running it
AI: "I've created verification tests."
You: "Great!" *Ships to production*
Production: *Everything breaks*
Solution: Always require AI to run verification and show output
"Create verification tests, RUN THEM, and show me the output."
❌ Pitfall 2: Accepting Partial Verification
Problem: Only verifying happy path
// Incomplete verification
it('creates user', async () => {
const user = await createUser('[email protected]', 'password');
expect(user.email).toBe('[email protected]');
});
// Missing:
// - Duplicate email test
// - Invalid email test
// - Weak password test
// - SQL injection test
// - Rate limiting test
Solution: Explicitly request edge case coverage
"Verification must test:
- Happy path
- All error cases
- Edge cases (empty, null, max length)
- Security (injection, XSS)
- Performance (rate limiting, timeouts)"
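Under the same assumption as the incomplete example above (a hypothetical createUser helper that throws on invalid input), the requested edge-case coverage might look like:
// Sketch of the additional cases – createUser is the same hypothetical helper
it('rejects duplicate emails', async () => {
  await createUser('[email protected]', 'SecurePass123!');
  await expect(createUser('[email protected]', 'SecurePass123!')).rejects.toThrow();
});
it('rejects malformed emails', async () => {
  await expect(createUser('not-an-email', 'SecurePass123!')).rejects.toThrow();
});
it('rejects weak passwords', async () => {
  await expect(createUser('[email protected]', 'short')).rejects.toThrow();
});
it('rejects SQL injection in the email field', async () => {
  await expect(createUser("'; DROP TABLE users;--", 'SecurePass123!')).rejects.toThrow();
});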
❌ Pitfall 3: Not Fixing Failures Immediately
Problem: Seeing failures but fixing “later”
❌ 3 tests failed
You: "I'll fix those later"
*Context lost, takes 10x longer to fix*
Solution: Fix failures immediately while context is fresh
"3 tests failed. Fix them now and re-run verification."
❌ Pitfall 4: Over-Relying on Unit Tests
Problem: Generating unit tests with mocks instead of integration tests
// Low-value verification
const mockDb = { create: jest.fn() };
await service.createUser('[email protected]');
expect(mockDb.create).toHaveBeenCalled(); // Meaningless
Solution: Prefer integration tests that verify real behavior
"Create INTEGRATION tests that verify the actual API endpoints
with a real test database, not mocked dependencies."
Integration with Other Patterns
Combine with Integration Tests
Trust But Verify works best with integration tests:
"Implement payment processing.
Create integration tests that:
1. Start test server with test database
2. Test complete payment flows
3. Verify database state after each operation
4. Test with real Stripe test mode
Run tests and show results."
Combine with Claude Code Hooks
Automate verification in hooks:
# .claudehooks/post-write
#!/bin/bash
# Run verification after any code change
if [[ "$CLAUDE_FILES_CHANGED" == *"auth.service.ts"* ]]; then
echo "Running auth verification..."
npm run verify:auth
fi
Combine with Evaluation Driven Development
Use verification as your evaluation:
"Implement feature X.
Evaluation criteria (must pass):
1. All integration tests pass
2. All Playwright tests pass
3. All security tests pass
4. All performance tests pass
Only mark complete when ALL evaluations pass."
Measuring Success
Key Metrics
1. Review Time Reduction
Before: 3 hours reviewing 1000 lines of code
After: 30 seconds reviewing test output
Reduction: 99.7%
2. Bug Detection Rate
Before: Manual review catches 40-60% of bugs
After: Automated verification catches 80-95% of bugs
Improvement: 2x better
3. Iteration Speed
Before: 1-2 iterations per day (waiting for review)
After: 10-20 iterations per day (immediate verification)
Improvement: 10x faster
4. Regression Rate
Before: 20-30% of bugs are regressions
After: <5% regressions (tests prevent them)
Improvement: 6x fewer regressions
Tracking Dashboard
interface VerificationMetrics {
totalVerifications: number;
passRate: number;
avgFixTime: number; // minutes
bugsPreventedByVerification: number;
reviewTimeSaved: number; // hours
}
const metrics: VerificationMetrics = {
totalVerifications: 347,
passRate: 0.73, // 73% pass first time
avgFixTime: 8, // 8 minutes to fix failures
bugsPreventedByVerification: 234,
reviewTimeSaved: 520, // hours
};
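reviewTimeSaved can be derived instead of hand-maintained – a rough sketch, where the per-review hour figures are assumptions:
// Assumed averages, not measured constants
const MANUAL_REVIEW_HOURS = 1.5;      // typical manual review per feature
const OUTPUT_SCAN_HOURS = 30 / 3600;  // ~30 seconds to scan verification output
const derivedTimeSaved =
  metrics.totalVerifications * (MANUAL_REVIEW_HOURS - OUTPUT_SCAN_HOURS);
// 347 × ~1.49 h ≈ 517 hours – consistent with the 520 tracked above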
Conclusion
The Trust But Verify Protocol shifts your role from code reviewer to quality validator:
Old approach:
- Read every line of generated code
- Mentally execute edge cases
- Try to spot bugs visually
- Result: Slow, error-prone, tedious
Trust But Verify:
- AI generates code + verification
- Scan verification output
- Fix failures immediately
- Result: Fast, systematic, effective
The pattern:
1. AI writes code
2. AI writes verification (tests, scripts, checks)
3. AI runs verification
4. You review output (not code)
5. Fix failures while context is fresh
6. Verification becomes permanent quality gate
The benefits:
- ✅ 99% reduction in review time
- ✅ 2x better bug detection
- ✅ 10x faster iteration
- ✅ Compound learning (AI improves over time)
- ✅ Regression protection (tests prevent backsliding)
The mindset shift:
From: "I need to review this code to ensure it's correct"
To: "I need to see evidence this code works correctly"
Don’t trust AI output. But don’t manually review everything either.
Trust, but verify. Through automation.
Related Concepts
- Integration Tests Over Unit Tests: Prefer integration tests for higher-signal verification
- Evaluation Driven Development: Use verification as evaluation criteria
- Test-Based Regression Patching: Write tests that make bugs illegal
- Claude Code Hooks as Quality Gates: Automate verification in development hooks
- Playwright Script Loop: Generate visual verification scripts for UI testing
- Quality Gates as Information Filters
- Stateless Verification Loops
- LLM Recursive Function Model
- Verification Sandwich Pattern
- YOLO Mode Configuration: Eliminate permission prompts by trusting quality gates instead of manual approval
References
- Playwright Documentation – Browser automation for visual verification testing
- Vitest Documentation – Fast test framework for verification scripts

