Ad-hoc Flows to Deterministic Scripts

James Phoenix

If you’re running the same agent flow regularly, convert it to a script. Deterministic beats probabilistic for known workflows.

The Pattern

Ad-hoc agent flow (used once) → Keep as conversation
Ad-hoc agent flow (used 3+ times) → Convert to script

When you find yourself prompting the same sequence repeatedly, that’s a signal: make it deterministic.

Why Convert?

Ad-hoc Agent Flow	Deterministic Script
Variable latency (LLM thinking)	Fast, predictable execution
Probabilistic (might do it differently)	Same behavior every time
Token cost per run	Zero LLM cost
Can deviate or get confused	Follows exact steps
Good for exploration	Good for repetition

The Conversion Process

Step 1: Identify Repeated Flows

Watch for patterns in your prompts:

“Run the tests, fix any failures, then lint”
“Deploy to staging, run smoke tests, notify Slack”
“Pull latest, rebase, run tests, push”

If you’ve typed it (or similar) 3+ times, it’s a candidate.
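If you keep a record of your prompts, the three-times rule can be checked mechanically. The sketch below assumes a hypothetical plain-text log with one prompt per line (`prompts.log` is an illustrative name, not a file any tool is known to produce) and a hypothetical helper called `repeated_prompts`. It only catches exact repeats after lowercasing and trimming, so merely similar prompts still need a human eye.

```bash
#!/bin/bash
# Hypothetical helper: print prompts that occur at least N times (default 3).
# Assumes a log file with one prompt per line; that format is an assumption.
repeated_prompts() {
    local log_file="$1"
    local threshold="${2:-3}"

    # Lowercase, trim whitespace, drop blank lines, then count identical lines.
    tr '[:upper:]' '[:lower:]' < "$log_file" \
        | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
        | grep -v '^$' \
        | sort \
        | uniq -c \
        | awk -v min="$threshold" '{
              count = $1
              # Strip the count column that uniq -c prepends.
              sub(/^[[:space:]]*[0-9]+[[:space:]]/, "")
              if (count + 0 >= min + 0) print
          }'
}

# Usage: repeated_prompts prompts.log 3
```

Anything this prints is a candidate for Step 2.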

Step 2: Extract the Steps

Document what the agent actually does:

## Deploy to Staging Flow

1. Run `bun test`
2. If tests pass, run `bun build`
3. Run `gcloud run deploy staging --source .`
4. Run smoke test: `curl https://staging.example.com/health`
5. If healthy, post to Slack

Step 3: Convert to Script

#!/bin/bash
# scripts/deploy-staging.sh

set -e

echo "Running tests..."
bun test

echo "Building..."
bun build

echo "Deploying to staging..."
gcloud run deploy staging --source . --quiet

echo "Running smoke test..."
if curl -sf https://staging.example.com/health > /dev/null; then
    echo "✓ Staging healthy"
    curl -X POST -H 'Content-Type: application/json' "$SLACK_WEBHOOK" -d '{"text":"Staging deployed successfully"}'
else
    echo "✗ Staging health check failed"
    exit 1
fi
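One way to keep the step list from Step 2 visible in the script itself is to wrap each command in a small helper that names the step and reports which one failed. `run_step` below is a hypothetical helper, not part of the script above; the commented usage reuses the commands from the deploy flow.

```bash
#!/bin/bash
# Hypothetical helper (an assumption for illustration): run one named step,
# print its name, and return the command's exit status on failure.
run_step() {
    local name="$1"
    shift

    echo "==> $name"
    if "$@"; then
        echo "ok: $name"
    else
        local code=$?
        echo "FAILED: $name (exit $code)" >&2
        return "$code"
    fi
}

# Usage sketch, chaining the steps so the first failure stops the flow:
#   run_step "tests"  bun test &&
#   run_step "build"  bun build &&
#   run_step "deploy" gcloud run deploy staging --source . --quiet
```

The failure message goes to stderr, so a log of the run shows exactly which step stopped the flow.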

Step 4: Make It a Slash Command

# .claude/commands/deploy-staging.md
Run the staging deployment script:

```bash
./scripts/deploy-staging.sh
```

Report the outcome.

Now instead of explaining the flow, you just type /deploy-staging.

Examples

Before: Ad-hoc Test Fix Flow

User: Run the tests, if any fail, fix them, then run again until they pass
Agent: [runs tests, analyzes failures, makes fixes, re-runs...]

Problems:

Takes 30-60 seconds of LLM thinking per iteration
Agent might fix things incorrectly
Different approach each time

After: Deterministic Script

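#!/bin/bash
# scripts/fix-tests.sh

# Without pipefail, the pipeline below reports tee's exit status, not bun's,
# so the script would always claim the tests passed.
set -o pipefail

MAX_ATTEMPTS=5
ATTEMPT=1

# Re-running unchanged code only helps with flaky tests; real failures
# fall through to the agent after the final attempt.
while [ "$ATTEMPT" -le "$MAX_ATTEMPTS" ]; do
    echo "Attempt $ATTEMPT of $MAX_ATTEMPTS"

    if bun test 2>&1 | tee test-output.txt; then
        echo "✓ All tests passing"
        rm -f test-output.txt
        exit 0
    fi

    # Extract failing test files (adjust the pattern to your test runner's output)
    FAILING=$(grep -E "FAIL.*\.test\." test-output.txt | awk '{print $2}')

    echo "Failing tests: $FAILING"
    echo "Manual intervention needed for: $FAILING"

    ATTEMPT=$((ATTEMPT + 1))
done

echo "✗ Tests still failing after $MAX_ATTEMPTS attempts"
exit 1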

Then use the agent only for the hard part:

User: /fix-tests failed. Here's the output. Fix these specific failures: [paste]
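When handing failures to the agent, it can help to paste only the relevant lines rather than the full log. The sketch below assumes the `test-output.txt` file written by the script above and a runner that marks failures with the word `FAIL`. The helper name `failure_excerpt` and the 20-line cap are illustrative assumptions.

```bash
#!/bin/bash
# Hypothetical helper: print failing lines plus a little context, capped in length.
# Assumes failures are marked with "FAIL"; adjust the pattern for your test runner.
failure_excerpt() {
    local output_file="$1"
    local max_lines="${2:-20}"

    # -A 2 keeps two lines after each match; head caps what gets pasted.
    grep -A 2 'FAIL' "$output_file" | head -n "$max_lines"
}

# Usage: failure_excerpt test-output.txt 20
```

A shorter paste keeps the agent focused on the failures and costs fewer tokens.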

Hybrid Approach: Scripts + Agent Judgment

Some flows need both determinism AND judgment:

#!/bin/bash
# scripts/analyze-and-fix.sh

# Deterministic: gather data
# Redirect stdout first, then point stderr at it, so both land in the file.
echo "Gathering diagnostics..."
bun test > test-output.txt 2>&1
bun run typecheck > type-output.txt 2>&1
biome check src/ > lint-output.txt 2>&1

# Deterministic: summarize
# grep -c already prints 0 when nothing matches; "|| true" only masks its
# non-zero exit status (an "|| echo 0" here would print a second 0).
echo "=== Summary ==="
echo "Test failures: $(grep -c FAIL test-output.txt || true)"
echo "Type errors: $(grep -c error type-output.txt || true)"
echo "Lint issues: $(grep -c '✖' lint-output.txt || true)"

# Output for agent to analyze
echo ""
echo "=== Details for Agent ==="
cat test-output.txt type-output.txt lint-output.txt

# .claude/commands/diagnose.md
Run the diagnostic script and analyze the output:

```bash
./scripts/analyze-and-fix.sh
```

Based on the output, prioritize issues and create a fix plan.

Best of both worlds:

Deterministic data gathering (fast, reliable)
Agent judgment on what to fix (intelligent)
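Counting matches with `grep` depends on each tool's output format. A complementary sketch is to record each tool's exit status, which test runners, type checkers, and linters typically set to non-zero on failure. `run_check` is a hypothetical helper, and the output file names in the usage comments mirror the ones used above.

```bash
#!/bin/bash
# Hypothetical helper: run one diagnostic, save its combined output,
# and print a one-line pass/fail summary based on the exit status.
run_check() {
    local label="$1"
    local out_file="$2"
    shift 2

    if "$@" > "$out_file" 2>&1; then
        echo "$label: pass"
    else
        echo "$label: FAIL (exit $?, details in $out_file)"
    fi
}

# Usage sketch with the commands from the diagnostic script:
#   run_check "tests" test-output.txt bun test
#   run_check "types" type-output.txt bun run typecheck
#   run_check "lint"  lint-output.txt biome check src/
```

The summary lines stay short, and the agent can open the named file only for the checks that failed.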

When to Keep It Ad-hoc

Not everything should be scripted:

Keep Ad-hoc	Convert to Script
One-off exploration	Repeated workflows
Unknown steps	Known sequences
Needs judgment throughout	Mostly mechanical
Learning new codebase	Established patterns

The Latency Argument

Ad-hoc flow: 45 seconds (LLM reasoning + execution)
Script: 3 seconds (just execution)

Over 10 runs:

Ad-hoc: 7.5 minutes
Script: 30 seconds

The savings compound. Plus, scripts don’t burn tokens.
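The arithmetic generalizes to any pair of timings. The sketch below uses the 45-second and 3-second figures from above as inputs; `time_saved` is a hypothetical helper name.

```bash
#!/bin/bash
# Hypothetical helper: time saved by a script over N runs, as minutes and seconds.
time_saved() {
    local adhoc_seconds="$1"
    local script_seconds="$2"
    local runs="$3"

    local saved=$(( (adhoc_seconds - script_seconds) * runs ))
    echo "$(( saved / 60 ))m $(( saved % 60 ))s"
}

time_saved 45 3 10   # prints "7m 0s" (450s ad-hoc minus 30s scripted)
```

Plugging in your own measured timings shows how quickly a given script pays for itself.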

Conversion Checklist

Identified a repeated flow (3+ times)
Documented the exact steps
Written a bash/python script
Created a slash command wrapper
Tested the script works reliably
Agent now uses script instead of improvising
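For the "works reliably" item, one rough check is to run the script several times and compare exit status and output. `check_repeatable` is a hypothetical helper. It suits scripts whose output is stable between runs; a script that prints timestamps or has side effects such as deploying would need a looser comparison.

```bash
#!/bin/bash
# Hypothetical helper: run a command N times and confirm that every run
# produces the same exit status and the same output as the first run.
check_repeatable() {
    local runs="$1"
    shift

    local first_out first_code out code i
    first_out=$("$@" 2>&1)
    first_code=$?

    i=2
    while [ "$i" -le "$runs" ]; do
        out=$("$@" 2>&1)
        code=$?
        if [ "$code" -ne "$first_code" ] || [ "$out" != "$first_out" ]; then
            echo "run $i differed from run 1" >&2
            return 1
        fi
        i=$((i + 1))
    done

    echo "consistent across $runs runs (exit $first_code)"
}

# Usage: check_repeatable 3 ./scripts/analyze-and-fix.sh
```

A difference between runs does not always mean a bug, but it is worth understanding before the agent starts relying on the script.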

Key Principle

Agents are for decisions. Scripts are for execution.

Use agents to figure out WHAT to do. Use scripts to DO it consistently.

Related

Agent Capabilities – Give scripts to agents as tools
Context-Efficient Backpressure – Scripts for output compression
Building the Harness – Scripts are part of the harness