Summary
LLMs generate code without knowing if the current state is clean, leading to confusion about whether failures are new or pre-existing. The verification sandwich pattern solves this by running all quality gates before and after generation, establishing a clean baseline and making it obvious when new issues are introduced.
The Problem
LLMs generate code without knowing if the current state is clean. When tests fail after generation, it’s unclear if the LLM broke something or if tests were already failing. This ambiguity wastes time debugging pre-existing issues instead of focusing on new changes. Without a baseline, every failure looks like a new problem.
The Solution
Always sandwich code generation between two verification steps: (1) Pre-Verification establishes a clean baseline by running all quality gates before making changes, (2) Generation makes the code changes, (3) Post-Verification runs the same gates again to detect only new issues. This pattern makes it obvious what changed and eliminates debugging of pre-existing failures.
The Problem: Ambiguous Failures
Imagine asking an LLM to add a new feature. It generates code, you run the tests, and 3 tests fail.
Question: Did the LLM break something, or were those tests already failing?
Without knowing the baseline state, you can’t tell. This leads to:
- Wasted debugging time investigating pre-existing failures
- False blame on the LLM for issues it didn’t cause
- Missed regressions when new failures are hidden among old ones
- Uncertainty about whether it’s safe to merge
Real-World Example
# You ask the LLM to add user authentication
$ claude "Add user authentication to the API"
# LLM generates code
# You run tests
$ npm test
FAILED:
- user.test.ts:45 - "should hash passwords"
- user.test.ts:67 - "should validate email format"
- auth.test.ts:23 - "should return 401 for invalid token"
# Now what?
# Were these tests failing before?
# Did the LLM break them?
# Are they related to authentication at all?
You have no idea what changed. The only way to know is to:
- Revert the LLM’s changes
- Run tests again
- Compare the results
This takes 5-10 minutes every time you generate code.
The Solution: Verification Sandwich
The verification sandwich pattern eliminates ambiguity by always knowing your baseline.
┌─────────────────────────────────────┐
│ 1. PRE-VERIFICATION (Baseline) │
│ ├─ Run tests → All pass ✓ │
│ ├─ Run type check → Clean ✓ │
│ └─ Run linter → Clean ✓ │
├─────────────────────────────────────┤
│ 2. GENERATION │
│ └─ Make the code change │
├─────────────────────────────────────┤
│ 3. POST-VERIFICATION (Delta) │
│ ├─ Run tests → Detect failures │
│ ├─ Run type check → Find errors │
│ └─ Run linter → Catch issues │
└─────────────────────────────────────┘
The Key Insight
If pre-verification fails, STOP immediately. Don’t generate code on top of a broken baseline.
This forces you to:
- Fix existing issues first
- Establish a clean state
- Only then make new changes
Result: Post-verification failures are guaranteed to be from the new changes.
Implementation
Step 1: Define Your Quality Gates
A quality gate is any automated check that verifies correctness:
# Common quality gates
npm test # Unit & integration tests
npm run type-check # TypeScript type checking
npm run lint # ESLint
npm run format:check # Prettier
npm run build # Compilation
Choose gates that are:
- Fast (< 30 seconds total)
- Deterministic (same input → same output)
- Comprehensive (cover most common errors)
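To check gates against the 30-second budget, a small timing helper can report per-gate durations. This is a minimal sketch; time_gate is a hypothetical helper, and the gate commands in the usage comment are the npm scripts from above:

```shell
# Hypothetical helper: time one quality gate and report its duration.
time_gate() {
  local name="$1"; shift
  local start end
  start=$(date +%s)
  if ! "$@" > /dev/null 2>&1; then
    echo "$name: FAILED"
    return 1
  fi
  end=$(date +%s)
  echo "$name: $((end - start))s"
}

# Example usage with the gates above:
#   time_gate type-check npm run type-check
#   time_gate lint npm run lint
#   time_gate test npm test
```

If any gate regularly exceeds its share of the budget, that is the one to move to CI or split up.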
Step 2: Create a Verification Script
#!/bin/bash
# scripts/verify.sh
set -e # Exit on any failure
echo "🔍 Running quality gates..."
echo " ├─ Type checking..."
npm run type-check
echo " ├─ Linting..."
npm run lint
echo " ├─ Testing..."
npm test
echo " └─ Building..."
npm run build
echo "✅ All quality gates passed!"
Key details:
- Use set -e to stop on first failure
- Run fast checks first (type-check before tests)
- Provide clear output showing progress
Step 3: Pre-Verification Hook
Use Claude Code hooks to automatically run verification before generation:
// .claude/config.json
{
  "hooks": {
    "pre-request": "./scripts/verify.sh"
  }
}
Now, every time you ask Claude to generate code:
# You run:
$ claude "Add user authentication"
# Automatically runs first:
$ ./scripts/verify.sh
🔍 Running quality gates...
├─ Type checking... ✓
├─ Linting... ✓
├─ Testing... ✓
└─ Building... ✓
✅ All quality gates passed!
# Only then does generation happen
If verification fails, the hook blocks the request:
$ claude "Add user authentication"
$ ./scripts/verify.sh
🔍 Running quality gates...
├─ Type checking... ✓
├─ Linting... ✓
├─ Testing... ✗
FAILED: user.test.ts:45 - "should hash passwords"
❌ Quality gates failed. Fix issues before generating new code.
[Request blocked]
This forces you to fix the failing test before proceeding.
Step 4: Post-Verification Hook
After generation, automatically run verification again:
// .claude/config.json
{
  "hooks": {
    "pre-request": "./scripts/verify.sh",
    "post-request": "./scripts/verify.sh"
  }
}
Now the full workflow is:
$ claude "Add user authentication"
# 1. Pre-verification
🔍 Running quality gates...
✓ All gates pass
# 2. Generation
📝 Adding authentication...
├─ Created src/auth.ts
├─ Updated src/api.ts
└─ Added tests in auth.test.ts
# 3. Post-verification
🔍 Running quality gates...
├─ Type checking... ✓
├─ Linting... ✓
├─ Testing... ✗
FAILED: auth.test.ts:23 - "should return 401 for invalid token"
└─ Building... (skipped)
❌ New issues introduced:
- auth.test.ts:23
Key insight: Because pre-verification passed, you know this failure is from the new code.
Step 5: Manual Verification (No Hooks)
If you’re not using hooks, manually run the pattern:
# 1. Pre-verification
$ ./scripts/verify.sh
✅ All quality gates passed!
# 2. Generation
$ claude "Add user authentication"
# ... generates code ...
# 3. Post-verification
$ ./scripts/verify.sh
❌ Tests failed: auth.test.ts:23
This is less automated but still effective.
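The manual loop can also be wrapped in a small script so the three steps always run in order. A hedged sketch, not a prescribed implementation: sandwich is a hypothetical helper, and the verify and generation commands are whatever your project uses:

```shell
# Hypothetical wrapper for the manual loop: verify, generate, verify again.
sandwich() {
  local verify_cmd="$1" generate_cmd="$2"
  echo "1/3 pre-verification (baseline)"
  eval "$verify_cmd" || { echo "baseline broken; fix before generating"; return 1; }
  echo "2/3 generation"
  eval "$generate_cmd" || return 1
  echo "3/3 post-verification (delta)"
  eval "$verify_cmd" || { echo "new issues introduced by this change"; return 1; }
  echo "sandwich complete"
}

# Example: sandwich './scripts/verify.sh' 'claude "Add user authentication"'
```

Because the wrapper aborts when pre-verification fails, it enforces the key insight from earlier: you cannot generate on top of a broken baseline.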
Advanced Patterns
Pattern 1: Selective Verification
For large codebases, running all tests is slow. Use targeted verification:
#!/bin/bash
# scripts/verify-targeted.sh
set -e
# Get changed files
CHANGED_FILES=$(git diff --name-only HEAD)
if echo "$CHANGED_FILES" | grep -q "src/auth"; then
  echo "🔍 Running auth tests..."
  npm test -- --testPathPattern=auth
else
  echo "🔍 Running all tests..."
  npm test
fi
Pattern 2: Progressive Verification
Run fast checks first, skip slow checks if fast ones fail:
#!/bin/bash
# scripts/verify-progressive.sh
set -e
echo "⚡ Fast checks..."
npm run type-check # 2 seconds
npm run lint # 3 seconds
echo "🧪 Running tests (this may take a while)..."
npm test # 30 seconds
echo "🏗️ Building..."
npm run build # 10 seconds
If type-check fails (2 seconds), you don’t waste 40 seconds on tests and build.
Pattern 3: Parallel Verification
Run independent checks in parallel:
#!/bin/bash
# scripts/verify-parallel.sh
set -e
echo "🔍 Running quality gates in parallel..."
# Run checks in parallel
npm run type-check &
PID_TYPECHECK=$!
npm run lint &
PID_LINT=$!
npm test &
PID_TEST=$!
# Wait for all to complete
wait $PID_TYPECHECK || exit 1
wait $PID_LINT || exit 1
wait $PID_TEST || exit 1
echo "✅ All quality gates passed!"
This cuts verification time from 35s to 30s (limited by slowest check).
Pattern 4: Verification with Context
Save pre-verification results for comparison:
#!/bin/bash
# scripts/verify-with-context.sh (usage: ./scripts/verify-with-context.sh before|after)
set -e
LABEL="${1:-before}"
# Run tests and save labeled results (tee masks npm's exit code, so counting below still runs)
npm test 2>&1 | tee "test-results-$LABEL.txt"
# Count failures (grep -c exits non-zero when nothing matches, hence || true)
FAILURES=$(grep -c "FAILED" "test-results-$LABEL.txt" || true)
if [ "$FAILURES" -gt 0 ]; then
  echo "❌ $FAILURES test(s) failed"
  exit 1
else
  echo "✅ All tests passed"
fi
Then after generation:
# Compare before/after
$ diff test-results-before.txt test-results-after.txt
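Instead of eyeballing the full diff, the two result files can be compared line by line to print only the delta. A sketch assuming failure lines start with "FAILED" as in the output above; new_failures is a hypothetical helper:

```shell
# Hypothetical helper: print failures present after generation but not before.
new_failures() {
  local before="$1" after="$2"
  # comm -13 keeps lines unique to the second (sorted) input: the new failures
  comm -13 <(grep "FAILED" "$before" | sort) \
           <(grep "FAILED" "$after" | sort)
}

# Example: new_failures test-results-before.txt test-results-after.txt
```

Anything this prints is guaranteed to be from the new change, which is exactly the delta the sandwich pattern is designed to isolate.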
Real-World Example
Scenario: Adding a new API endpoint
# 1. Pre-verification
$ ./scripts/verify.sh
🔍 Running quality gates...
├─ Type checking... ✓ (0 errors)
├─ Linting... ✓ (0 warnings)
├─ Testing... ✓ (124 passed)
└─ Building... ✓
✅ All quality gates passed!
# 2. Generation
$ claude "Add GET /api/users/:id endpoint"
📝 Adding endpoint...
├─ Created src/api/users.ts
├─ Updated src/api/routes.ts
└─ Added tests in users.test.ts
# 3. Post-verification
$ ./scripts/verify.sh
🔍 Running quality gates...
├─ Type checking... ✗ (1 error)
src/api/users.ts:15:20 - Property 'id' does not exist on type 'Request'
└─ (remaining checks skipped)
❌ New issues introduced:
- Type error in src/api/users.ts:15
# 4. Fix the issue
$ claude "Fix the type error in users.ts"
📝 Fixing type error...
└─ Updated src/api/users.ts (use req.params.id)
# 5. Post-verification (automatic)
$ ./scripts/verify.sh
🔍 Running quality gates...
├─ Type checking... ✓ (0 errors)
├─ Linting... ✓ (0 warnings)
├─ Testing... ✓ (125 passed) [+1 new test]
└─ Building... ✓
✅ All quality gates passed!
Result: You know exactly what changed at each step:
- After first generation: 1 type error introduced
- After fix: Error resolved, all gates pass
When NOT to Use
The verification sandwich pattern isn’t always necessary:
❌ Skip for Trivial Changes
# Documentation updates
$ claude "Fix typo in README.md"
# No need to run tests
# Comment changes
$ claude "Add JSDoc comments to utils.ts"
# Type-check is enough
❌ Skip During Exploration
# Trying different approaches
$ claude "Try implementing this with recursion"
# Run verification manually when done exploring
❌ Skip for Read-Only Requests
# Questions about code
$ claude "Explain how the auth system works"
# No code changes, no verification needed
✅ Always Use for Production Code
# Feature development
$ claude "Add user registration"
✅ Use verification sandwich
# Bug fixes
$ claude "Fix race condition in payment processing"
✅ Use verification sandwich
# Refactoring
$ claude "Extract helper functions from AuthService"
✅ Use verification sandwich
Best Practices
1. Keep Verification Fast
Target: < 30 seconds total
If verification is slow, developers will skip it.
# ✓ Fast verification (20s)
npm run type-check # 2s
npm run lint # 3s
npm test # 15s
# ✗ Slow verification (5 min)
npm run type-check # 2s
npm run lint # 3s
npm test # 15s
npm run e2e-test # 4min 40s ← Too slow!
Solution: Move slow tests to CI, keep local verification fast.
2. Make Verification Obvious
Use clear output that shows exactly what passed/failed:
# ✓ Clear output
🔍 Running quality gates...
├─ Type checking... ✓
├─ Linting... ✓
├─ Testing... ✗
FAILED: auth.test.ts:23
└─ Building... (skipped)
# ✗ Unclear output
Running checks...
Error: Command failed
3. Fail Fast
Stop on first failure instead of running all checks:
# ✓ Fail fast (stops after type-check)
set -e
npm run type-check # ✗ Fails
# npm run lint (skipped)
# npm test (skipped)
# ✗ Run all checks even after failure
npm run type-check || true # ✗ Fails but continues
npm run lint # Runs anyway
npm test # Runs anyway
4. Version Control Integration
Run pre-verification on checkout:
#!/bin/bash
# .git/hooks/post-checkout
echo "🔍 Verifying clean state after checkout..."
./scripts/verify.sh
This catches issues immediately after switching branches.
5. CI/CD Integration
Use the same verification script in CI:
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
      - run: npm ci
      - run: ./scripts/verify.sh
This ensures local and CI verification are identical.
Common Pitfalls
Pitfall 1: Skipping Pre-Verification
# ✗ Skipping pre-verification
$ claude "Add feature X"
# ... generates code ...
$ npm test
FAILED: 3 tests
# Now you don't know if these 3 failures are new or old
Solution: Always run pre-verification, even if you “think” the state is clean.
Pitfall 2: Ignoring Pre-Verification Failures
# ✗ Continuing despite failures
$ ./scripts/verify.sh
❌ Tests failed: user.test.ts:45
$ claude "Add feature X anyway" ← Bad!
Solution: Fix the baseline before making new changes.
Pitfall 3: Different Pre/Post Verification
# ✗ Different checks
# Pre-verification
npm run type-check
# Post-verification
npm test ← Different gates!
Solution: Use identical verification for pre and post.
Pitfall 4: Non-Deterministic Tests
# ✗ Flaky tests
$ npm test
Passed (124/124)
$ npm test
Failed (1/124) ← Different result!
Solution: Fix flaky tests first, or exclude them from verification.
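A quick way to confirm a suspected flaky test is to run it repeatedly and compare outcomes. A rough sketch; flaky_check is a hypothetical helper that runs a command N times in a subshell:

```shell
# Hypothetical helper: run a test command several times and flag inconsistent results.
flaky_check() {
  local cmd="$1" runs="${2:-5}" pass=0 fail=0 i
  for i in $(seq 1 "$runs"); do
    # sh -c isolates the command so an "exit" inside it cannot kill this shell
    if sh -c "$cmd" > /dev/null 2>&1; then pass=$((pass + 1)); else fail=$((fail + 1)); fi
  done
  if [ "$pass" -gt 0 ] && [ "$fail" -gt 0 ]; then
    echo "FLAKY: $pass passed, $fail failed out of $runs runs"
    return 1
  fi
  echo "deterministic: $pass/$runs passed"
}

# Example: flaky_check 'npm test -- --testPathPattern=auth' 10
```

A test that is flaky under this check will poison both halves of the sandwich, so fix or quarantine it before trusting any baseline.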
Measuring Success
Key Metrics
- Baseline confidence: % of time pre-verification passes (target: >95%)
- Delta clarity: % of time post-verification failures are from new code (target: 100%, guaranteed by the pattern)
- Debugging time: time spent investigating failures (target: 50% reduction; no more “was this already broken?”)
- False blame rate: % of failures blamed on the LLM that were pre-existing (target: 0%, eliminated by pre-verification)
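Baseline confidence can be computed from a simple log of verification runs. A sketch assuming a hypothetical log format of one "date phase PASS|FAIL" line per run; baseline_confidence is an illustrative helper, not part of any tool:

```shell
# Hypothetical metric: % of pre-verification runs that passed.
# Assumes one line per run, e.g. "2024-05-01 pre PASS"
baseline_confidence() {
  awk '$2 == "pre" { total++; if ($3 == "PASS") pass++ }
       END { if (total) printf "%.0f%%\n", 100 * pass / total; else print "no data" }' "$1"
}

# Example: baseline_confidence verify.log
```

Tracking this over time shows whether the team is actually keeping the baseline clean or routinely generating on top of known failures.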
Conclusion
The verification sandwich pattern is the simplest, highest-impact workflow improvement for AI-assisted development.
Core principle: Never generate code on top of a broken baseline.
Implementation:
- Run all quality gates before generation (pre-verification)
- Make the code change (generation)
- Run all quality gates after generation (post-verification)
Result: Instant clarity on what changed and what broke.
Key insight: Pre-verification failures are blockers, not warnings. Fix them first.
Related Concepts
- Test-Driven Prompting – Write tests before generating code to constrain the solution space
- Quality Gates as Information Filters – Theoretical foundation for why layered verification works
- Compounding Effects of Quality Gates – How stacked gates multiply quality improvements
- Stateless Verification Loops – Ensure each verification starts from clean state, preventing drift
- Incremental Development Pattern – Validate each increment before proceeding
- Plan Mode for Strategic Thinking – Architecture before implementation complements verification
- Claude Code Hooks Quality Gates – Automate verification with pre/post request hooks
- Test-Based Regression Patching – Use failed tests as immediate feedback for fixing issues
- Early Linting Prevents Ratcheting – Catch style issues before they compound
- Integration Testing Patterns – Integration tests provide higher signal than unit tests
- Trust But Verify Protocol – Complement pre/post verification with AI-generated tests
References
- Claude Code Hooks Documentation – How to set up pre/post request hooks for automated verification
- Git Hooks Documentation – Using Git hooks for automated verification on checkout/commit

