Stateless Verification Loops: Preventing State Accumulation in AI Workflows

James Phoenix

Summary

Stateless verification loops ensure each verification cycle starts from a clean slate, preventing accumulated state from causing drift and false positives. By resetting state between iterations, you achieve reproducible, reliable verification that catches real issues instead of artifacts from previous runs.

The Problem

AI coding workflows often accumulate state across verification cycles—cached build artifacts, leftover test databases, stale type caches, orphaned files. This accumulated state causes verification drift: tests pass locally but fail in CI, code works on the third run but not the first, and bugs hide behind stale state. Over time, the gap between ‘clean’ and ‘dirty’ environments grows, eroding confidence in verification.

The Solution

Design verification loops to be stateless: each iteration starts from a clean environment with no accumulated state. Clear caches, reset databases, remove generated files, restart processes between verification cycles. Use deterministic seeds, fixed timestamps, and controlled randomness. Result: verification behavior is reproducible across runs, environments, and machines—what passes locally passes in CI.

The Problem

In AI-assisted coding workflows, verification happens in loops:

Generate code → Test → Fix → Test → Fix → Test → Deploy

But each iteration can leave behind state:

  • Build artifacts: Compiled files, bundled assets
  • Test databases: Tables with data from previous test runs
  • Cache files: Type checker cache, module resolution cache
  • Generated files: Temporary files, logs, screenshots
  • Process state: Running servers, open connections
  • File system state: Created directories, modified configs

This accumulated state causes verification drift:

Symptom 1: “Works on My Machine”

# Developer's machine (after 10 test runs)
$ npm test
✓ All tests pass (cached modules, warm state)

# CI environment (clean slate)
$ npm test
✗ 5 tests fail (cold start, no cache)

Root cause: Developer’s environment has accumulated helpful state (cached dependencies, populated database) that masks real issues.

Symptom 2: “Works After Third Run”

# First run (clean)
$ npm test
✗ Test fails: "Database table 'users' not found"

# Second run (migration ran)
$ npm test
✗ Test fails: "Duplicate key violation"

# Third run (database reset)
$ npm test
✓ Test passes

Root cause: Each run modifies state (creates tables, inserts data) that affects subsequent runs.

Symptom 3: “Test Pollution”

// Test 1: Creates user with id=1
test('create user', () => {
  const user = createUser({ id: 1, email: '[email protected]' });
  expect(user.id).toBe(1);
});

// Test 2: Assumes database is empty
test('count users', () => {
  const count = countUsers();
  expect(count).toBe(0); // ✗ Fails: count is 1 (from Test 1)
});

Root cause: Test 1 leaves behind state; Test 2 assumes that state does not exist.

Symptom 4: “Flaky Tests”

# Run 1
$ npm test
✓ All tests pass

# Run 2 (immediately after)
$ npm test
✗ 3 tests fail

# Run 3 (after cache clear)
$ npm test
✓ All tests pass

Root cause: Tests inadvertently rely on state from previous runs (timing, order, cached values).

The Solution: Stateless Verification

Stateless verification means each verification cycle starts from a clean slate with no accumulated state.

Core Principle

Every verification run should be indistinguishable from the first run ever executed.

If Test Run 1 and Test Run 100 behave differently, you have state accumulation.
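
One cheap approximation of “the first run ever” is verifying from a cold clone and comparing results with your working tree. A sketch—the clone path is illustrative, and only committed files are cloned:

# Verify from a cold clone — if results differ from your working tree,
# the working tree has accumulated state
git clone . /tmp/fresh-clone
cd /tmp/fresh-clone
npm ci && npm test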

Implementation Strategy

┌─────────────────────────────────────────────────────┐
│ Verification Loop (Stateless)                       │
└─────────────────────────────────────────────────────┘

  ┌──────────────┐
  │ 1. Clean     │  ← Reset all state
  │    Slate     │
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐
  │ 2. Generate  │  ← LLM generates code
  │    Code      │
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐
  │ 3. Run       │  ← Execute verification
  │    Gates     │
  └──────┬───────┘
         │
         ▼
    ┌────┴────┐
    │ Pass?   │
    └────┬────┘
         │
    ┌────┴────┐
    │         │
   Yes       No
    │         │
    │         ▼
    │    ┌──────────────┐
    │    │ 4. Clean     │  ← Reset before retry
    │    │    Again     │
    │    └──────┬───────┘
    │           │
    │           ▼
    │      (Loop back to Generate)
    │
    ▼
  Done

Key insight: The Clean Slate step happens before every verification attempt, not just at the start.
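
In shell terms, the loop might look like this minimal sketch, where pre-verify and the codegen step stand in for your own reset and generation commands:

# Hypothetical loop skeleton — reset happens BEFORE every attempt
for attempt in 1 2 3; do
  npm run pre-verify              # clean slate first
  generate_candidate_code         # placeholder for the LLM codegen step
  npm run verify && break         # gates pass → done; fail → loop resets again
done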

What State to Reset

1. Build Artifacts

# Before each verification
rm -rf dist/
rm -rf build/
rm -rf .next/
rm -rf out/

# Then build fresh
npm run build

Why: Stale build artifacts can mask missing files, broken imports, or type errors.

2. Node Modules Cache

# For critical verification (e.g., pre-deploy)
rm -rf node_modules/
npm ci  # Clean install from lockfile

Why: Cached modules can hide dependency issues or version mismatches.

3. TypeScript Cache

# Clear TS build info
rm -rf tsconfig.tsbuildinfo
rm -rf .tsbuildinfo

# Clear TS cache
rm -rf node_modules/.cache/

Why: Stale type cache causes false positives (“no errors” when errors exist).

4. Test Database

// Before each test suite
beforeAll(async () => {
  await database.reset();           // Drop all tables
  await database.runMigrations();   // Recreate schema
  await database.seed();            // Insert test fixtures
});

// After each test suite
afterAll(async () => {
  await database.disconnect();      // Close connections
  await database.destroy();         // Delete database
});

Why: Leftover data from previous tests causes false failures or successes.

5. Test Isolation

// Each test gets its own isolated state
const originalEnv = { ...process.env };  // Snapshot once at module load

beforeEach(async () => {
  // Reset database to clean state
  await database.truncateAll();
  await database.seed();
  
  // Reset mocks
  jest.clearAllMocks();
  
  // Reset environment
  process.env = { ...originalEnv };
});

Why: Tests should not depend on execution order.

6. File System State

# Remove generated files
rm -rf coverage/
rm -rf .jest-cache/
rm -rf tmp/
rm -rf uploads/

# Reset config files (if modified by tests)
git checkout -- .env.test

Why: Generated files or modified configs pollute subsequent runs.

7. Process State

// Kill all running servers before verification
afterEach(async () => {
  await server.close();           // Close HTTP server
  await redis.disconnect();       // Close Redis
  await queue.close();            // Close job queue
});

Why: Orphaned processes hold locks, ports, or connections.
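
As a backstop, you can also free well-known ports before a run in case a previous process leaked. This sketch assumes the app listens on TCP port 3000:

# Kill whatever still holds the test port
# (GNU xargs -r skips kill when lsof finds nothing)
lsof -ti tcp:3000 | xargs -r kill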

8. Time-Based State

// Use fixed timestamps in tests
beforeAll(() => {
  jest.useFakeTimers();
  jest.setSystemTime(new Date('2025-01-01T00:00:00Z'));
});

afterAll(() => {
  jest.useRealTimers();
});

Why: Time-dependent code behaves differently on each run.

9. Random State

// Use deterministic random seed
beforeAll(() => {
  Math.random = seededRandom(12345);  // Fixed seed
});

Why: Random behavior makes tests flaky.
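
Note that seededRandom above is not a built-in. A minimal deterministic implementation (mulberry32) might look like the following; the seedrandom npm package is another common option:

// A small deterministic PRNG (mulberry32), usable as seededRandom above
function seededRandom(seed: number): () => number {
  return () => {
    let t = (seed += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}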

10. Cache Layers

// Clear all application caches
beforeEach(async () => {
  await cache.clear();              // Redis/Memcached
  await cdn.purge();                // CDN cache
  localStorage.clear();             // Browser storage
  sessionStorage.clear();           // Session storage
});

Why: Cached responses hide code changes.

Implementation Patterns

Pattern 1: Pre-Verification Reset Hook

Add a pre-verify script that resets all state:

{
  "scripts": {
    "pre-verify": "npm run clean && npm run reset-db && npm run clear-cache",
    "verify": "npm run pre-verify && npm run build && npm run test && npm run lint",
    "clean": "rm -rf dist/ build/ .next/ coverage/ tmp/",
    "reset-db": "npm run db:reset && npm run db:migrate && npm run db:seed",
    "clear-cache": "rm -rf node_modules/.cache/ tsconfig.tsbuildinfo"
  }
}

Usage:

# Every verification starts clean
$ npm run verify

# Equivalent to:
# 1. npm run pre-verify  (clean slate)
# 2. npm run build       (fresh build)
# 3. npm run test        (clean test)
# 4. npm run lint        (clean lint)

Pattern 2: CI/CD Clean Environment

Ensure CI always runs in a clean environment:

# .github/workflows/verify.yml
name: Verify
on: [push, pull_request]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          clean: true  # Clean working directory
      
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      
      # Don't use cached node_modules for critical paths
      - run: npm ci  # Clean install
      
      # Run verification from clean slate
      - run: npm run build
      - run: npm run test
      - run: npm run lint
      
      # No caching of build artifacts between jobs
      # Each job starts fresh

Key: Every CI run is stateless (no artifacts from previous runs).

Pattern 3: Test Database Per Suite

import { createDatabase, destroyDatabase } from './test-utils';

describe('User API', () => {
  let db: Database;
  
  // Create fresh database for this suite
  beforeAll(async () => {
    db = await createDatabase();
    await db.migrate();
  });
  
  // Destroy database after suite
  afterAll(async () => {
    await destroyDatabase(db);
  });
  
  // Reset data before each test
  beforeEach(async () => {
    await db.truncateAll();
    await db.seed();
  });
  
  test('creates user', async () => {
    // Test runs with clean database
  });
});

Benefit: Each test suite is isolated from others.
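
The createDatabase/destroyDatabase helpers are left undefined above. Here is a minimal sketch of what they might look like, assuming Postgres and the pg driver; the env var names and the empty migrate/seed bodies are placeholders, not a fixed API:

// test-utils.ts — a minimal sketch, assuming Postgres via the 'pg' driver
import { Client } from 'pg';
import { randomUUID } from 'crypto';

export interface Database {
  name: string;
  client: Client;
  migrate(): Promise<void>;
  truncateAll(): Promise<void>;
  seed(): Promise<void>;
}

async function withAdmin(sql: string): Promise<void> {
  const admin = new Client({ connectionString: process.env.ADMIN_DB_URL });
  await admin.connect();
  await admin.query(sql);
  await admin.end();
}

export async function createDatabase(): Promise<Database> {
  // Unique name per suite → parallel suites never collide
  const name = `test_${randomUUID().replace(/-/g, '')}`;
  await withAdmin(`CREATE DATABASE ${name}`);

  const client = new Client({
    connectionString: `${process.env.DB_BASE_URL}/${name}`,
  });
  await client.connect();

  return {
    name,
    client,
    migrate: async () => {
      // Run your migration tool against `name` here
    },
    truncateAll: async () => {
      // Empty every public table in one statement, resetting sequences
      const { rows } = await client.query(
        `SELECT tablename FROM pg_tables WHERE schemaname = 'public'`
      );
      if (rows.length > 0) {
        const tables = rows.map((r) => `"${r.tablename}"`).join(', ');
        await client.query(`TRUNCATE ${tables} RESTART IDENTITY CASCADE`);
      }
    },
    seed: async () => {
      // Insert fixtures here
    },
  };
}

export async function destroyDatabase(db: Database): Promise<void> {
  await db.client.end();  // Close connections before dropping
  await withAdmin(`DROP DATABASE ${db.name}`);
}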

Pattern 4: Docker for Reproducible Verification

# Dockerfile.verify
FROM node:20-alpine

WORKDIR /app

# Copy only package files first
COPY package*.json ./
RUN npm ci

# Copy source code
COPY . .

# Run verification
CMD ["npm", "run", "verify"]

# Every verification runs in clean container
$ docker build -f Dockerfile.verify -t app:verify .
$ docker run --rm app:verify

# Container is destroyed after run (no state persists)

Benefit: Perfectly reproducible environment every time.

Pattern 5: Stateless LLM Verification Loop

async function verifyWithLLM(code: string): Promise<VerifyResult> {
  // 1. Clean slate
  await resetEnvironment();
  
  // 2. Write code to disk
  await fs.writeFile('src/generated.ts', code);
  
  // 3. Run verification
  const buildResult = await runBuild();
  const testResult = await runTests();
  const lintResult = await runLint();
  
  // 4. Collect results
  const result = {
    success: buildResult.ok && testResult.ok && lintResult.ok,
    errors: [...buildResult.errors, ...testResult.errors, ...lintResult.errors],
  };
  
  // 5. Clean up (stateless)
  await resetEnvironment();
  
  return result;
}

async function resetEnvironment() {
  // Remove generated code
  await fs.rm('src/generated.ts', { force: true });
  
  // Clear build artifacts
  await fs.rm('dist/', { recursive: true, force: true });
  
  // Reset test database
  await database.reset();
  
  // Clear caches
  await cache.clear();
}

// Usage: Each LLM iteration starts clean
for (let attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
  const code = await llm.generate(prompt);
  const result = await verifyWithLLM(code);  // Stateless
  
  if (result.success) {
    break;  // Success
  } else {
    prompt = addErrors(prompt, result.errors);  // Try again
  }
}

Key: Each verification attempt is independent—no state leaks between attempts.

Best Practices

1. Make Reset Fast

If reset is slow, developers skip it:

# ✗ Slow reset (30 seconds)
$ rm -rf node_modules/ && npm install

# ✓ Fast reset (2 seconds)
$ npm run db:reset && npm run clean

Strategy: Reset only what’s necessary, not everything.
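
One way to keep resets cheap is to tier them. A sketch that reuses the db:* scripts from Pattern 1 (the reset:* names are illustrative):

{
  "scripts": {
    "reset:light": "rm -rf dist/ tsconfig.tsbuildinfo .eslintcache",
    "reset:db": "npm run db:reset && npm run db:migrate && npm run db:seed",
    "reset:deep": "npm run reset:light && npm run reset:db && rm -rf node_modules/.cache/"
  }
}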

2. Verify the Reset

Add a “clean check” to catch incomplete resets:

#!/bin/bash
# check-clean.sh

if [ -d "dist/" ]; then
  echo "❌ Error: dist/ still exists after clean"
  exit 1
fi

if [ -f "tsconfig.tsbuildinfo" ]; then
  echo "❌ Error: TypeScript cache still exists"
  exit 1
fi

echo "✅ Environment is clean"

3. Document State Dependencies

Make it clear what state each verification depends on:

# Verification Requirements

## Build
- Requires: Clean `dist/` directory
- Produces: Compiled files in `dist/`
- Reset: `rm -rf dist/`

## Tests
- Requires: Clean test database
- Produces: Test coverage report
- Reset: `npm run db:reset`

## Lint
- Requires: Clean TypeScript cache
- Produces: Lint report
- Reset: `rm -rf tsconfig.tsbuildinfo`

4. Use Idempotent Operations

Operations should produce same result regardless of current state:

// ✗ Not idempotent (fails if table exists)
await db.createTable('users');

// ✓ Idempotent (always succeeds)
await db.createTableIfNotExists('users');

// ✗ Not idempotent (fails if file doesn't exist)
await fs.rm('dist/output.js');

// ✓ Idempotent (always succeeds)
await fs.rm('dist/output.js', { force: true });

5. Separate State from Logic

Keep state external to verification logic:

// ✗ State embedded in verifier
class Verifier {
  private cachedResults = new Map();  // State!
  
  verify(code: string) {
    if (this.cachedResults.has(code)) {
      return this.cachedResults.get(code);
    }
    // ...
  }
}

// ✓ Stateless verifier
class Verifier {
  verify(code: string, cache?: Map<string, Result>) {  // State passed in
    if (cache?.has(code)) {
      return cache.get(code);
    }
    // ...
  }
}

6. Test the Tests

Verify your tests are stateless:

# Run tests multiple times
$ npm test && npm test && npm test

# All runs should produce identical results
# If results differ, tests have state dependencies
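
“Identical results” becomes mechanically checkable if you keep only the pass/fail summary line from each run. A sketch assuming Jest’s “Tests:” summary format:

# Collect the summary line from three consecutive runs
for i in 1 2 3; do
  npm test 2>&1 | grep -E '^Tests:' >> summaries.log
done

sort -u summaries.log | wc -l   # 1 = consistent, >1 = state dependency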

Common Pitfalls

✗ Pitfall 1: “Clean Install is Too Slow”

Problem: npm ci takes 2 minutes, so developers skip it.

Solution: Use clean install only for critical verification (CI, pre-deploy):

{
  "scripts": {
    "verify:fast": "npm run build && npm run test",
    "verify:full": "npm ci && npm run build && npm run test"
  }
}

✗ Pitfall 2: “Forgot to Reset Database”

Problem: Tests pass locally, fail in CI because database has stale data.

Solution: Make database reset automatic:

// Global test setup (e.g., a Jest setup file registered via setupFilesAfterEnv)
beforeAll(async () => {
  await database.reset();  // Always reset
});

✗ Pitfall 3: “Tests Depend on Execution Order”

Problem: Test A creates user, Test B expects user to exist.

Solution: Each test creates its own data:

// ✗ Test depends on previous test
test('get user', () => {
  const user = getUser(1);  // Assumes user 1 exists
  expect(user).toBeDefined();
});

// ✓ Test creates own data
test('get user', () => {
  const createdUser = createUser({ id: 1, email: '[email protected]' });
  const fetchedUser = getUser(1);
  expect(fetchedUser).toEqual(createdUser);
});

✗ Pitfall 4: “Cached Failure State”

Problem: Test fails once, then all subsequent runs fail even after fix.

Solution: Clear failure artifacts:

# Clear Jest cache
$ jest --clearCache

# Clear TypeScript cache
$ rm -rf tsconfig.tsbuildinfo

# Then run tests
$ npm test

✗ Pitfall 5: “Environment Variable Pollution”

Problem: Test sets process.env.NODE_ENV = 'test', affects other tests.

Solution: Reset environment after each test:

const originalEnv = { ...process.env };  // Snapshot a copy, not a reference

afterEach(() => {
  process.env = { ...originalEnv };  // Reset
});

Measuring Success

Track these metrics to verify statelessness:

1. Flaky Test Rate

Flaky test = Test that sometimes passes, sometimes fails (without code changes)

Target: 0% flaky tests

Stateless verification → Deterministic behavior → No flaky tests
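
A flakiness probe can re-run the suite and diff per-test outcomes. This sketch assumes Jest’s --json report format, with run-N.json files produced by npx jest --json --outputFile=run-N.json:

// flaky-check.ts — compare per-test status across repeated runs
import { readFileSync } from 'fs';

const runs = [1, 2, 3, 4, 5].map((i) =>
  JSON.parse(readFileSync(`run-${i}.json`, 'utf8'))
);

// Collect every status each test has ever produced
const outcomes = new Map<string, Set<string>>();
for (const run of runs) {
  for (const suite of run.testResults) {
    for (const test of suite.assertionResults) {
      const seen = outcomes.get(test.fullName) ?? new Set<string>();
      seen.add(test.status);
      outcomes.set(test.fullName, seen);
    }
  }
}

// More than one distinct status → flaky
const flaky = [...outcomes].filter(([, statuses]) => statuses.size > 1);
console.log(`${flaky.length} flaky test(s):`, flaky.map(([name]) => name));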

2. Local vs. CI Pass Rate

Local pass rate: 100%
CI pass rate: 95%

5% gap = State differences between environments

Target: <1% gap

3. Consecutive Run Consistency

# Run tests 10 times and count how many runs fail
fails=0
for i in {1..10}; do npm test || fails=$((fails+1)); done
echo "$fails/10 runs failed"

# Target: all 10 runs produce identical results

4. Time to Reproduce Issues

With state accumulation: "I can't reproduce this locally"
  → Hours of debugging environment differences

With stateless verification: "Failed in CI = Fails locally"
  → Minutes to reproduce and fix

Integration with AI Workflows

Pattern: Stateless Generate-Test-Fix Loop

async function llmGenerateWithVerification(
  prompt: string,
  maxAttempts: number = 5
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    // 1. Start with clean slate (STATELESS)
    await resetEnvironment();
    
    // 2. Generate code
    const code = await llm.generate(prompt);
    
    // 3. Verify (in clean environment)
    const result = await verify(code);
    
    if (result.success) {
      return code;  // Success!
    }
    
    // 4. Add errors to prompt for next attempt
    prompt = `${prompt}

Previous attempt failed:
${formatErrors(result.errors)}`;
    
    // Loop continues (next iteration starts clean)
  }
  
  throw new Error(`Failed to generate valid code after ${maxAttempts} attempts`);
}

Key insight: Each LLM attempt is independent. Attempt 3’s environment is identical to Attempt 1’s environment.

Pattern: Stateless CI/CD with LLM Auto-Fix

name: Verify and Auto-Fix

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      # 1. Clean checkout (no cache)
      - uses: actions/checkout@v4
        with:
          clean: true
      
      # 2. Clean install
      - run: npm ci
      
      # 3. Verify
      - id: verify
        run: npm run verify
        continue-on-error: true
      
      # 4. If failed, auto-fix (stateless)
      - if: steps.verify.outcome == 'failure'
        run: |
          # LLM analyzes failures
          claude-code fix --errors="$(npm run verify 2>&1)"
          
          # Re-verify (fresh environment)
          npm run verify

Benefit: Auto-fix runs in same clean environment as original failure.

Related Patterns

Stateless Verification + Quality Gates

Each quality gate resets state:

Generate code
  ↓
[RESET] → Type check (clean TypeScript cache)
  ↓
[RESET] → Lint (clean lint cache)
  ↓
[RESET] → Test (clean test database)
  ↓
[RESET] → Build (clean dist/)
  ↓
Deploy
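
In npm-script form, each gate can own its reset. A sketch, where db:reset is the project’s own command and .eslintcache assumes ESLint’s --cache flag is in use:

{
  "scripts": {
    "gate:types": "rm -rf tsconfig.tsbuildinfo && tsc --noEmit",
    "gate:lint": "rm -rf .eslintcache && eslint .",
    "gate:test": "npm run db:reset && jest",
    "gate:build": "rm -rf dist/ && npm run build",
    "gates": "npm run gate:types && npm run gate:lint && npm run gate:test && npm run gate:build"
  }
}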

See: Quality Gates as Information Filters

Stateless Verification + Test-Based Regression Patching

Regression tests must be stateless:

// Regression test for bug #123
test('handles empty array without crashing', () => {
  // This test should ALWAYS pass (no state dependencies)
  const result = processArray([]);
  expect(result).toEqual([]);
});

See: Test-Based Regression Patching

Stateless Verification + Integration Testing

Integration tests need more state reset:

beforeEach(async () => {
  await database.reset();     // Clear database
  await cache.clear();        // Clear Redis
  await queue.clear();        // Clear job queue
  await cdn.purge();          // Clear CDN
  await restartServices();    // Restart all services
});

See: Integration Testing Patterns

Conclusion

Stateless verification is the foundation of reliable, reproducible AI-assisted development.

Key Principles:

  1. Reset state before each verification cycle—don’t accumulate artifacts
  2. Make reset fast—if it’s slow, developers skip it
  3. Test the reset—verify your clean slate is actually clean
  4. Isolate tests—each test should run independently
  5. Use idempotent operations—same result regardless of initial state
  6. Separate state from logic—keep verifiers stateless
  7. Measure flakiness—track consistency across runs

The Result:

  • Reproducible: Same code + same verification = same result (always)
  • Reliable: No false positives from stale state
  • Debuggable: “Failed in CI” = “Will fail locally” (no mystery)
  • Confident: Trust your verification because it’s deterministic

Without stateless verification: Tests are flaky, CI is unreliable, debugging is painful.

With stateless verification: Tests are deterministic, CI is trustworthy, bugs are reproducible.


The difference between “works on my machine” and “works everywhere” is stateless verification.
