Markdown Files as State Machines for AI Development Workflows

James Phoenix

Summary

A structured markdown file can function as a reliable state machine for orchestrating multi-step AI development workflows. The key insight: prose instructions fail because LLMs treat them as suggestions. Numbered steps with explicit branching logic, checkpoint markers, and disk-persisted session state produce repeatable, context-resilient automation.

Source: Brad Feld, “Exploring /start: How a Markdown File Runs My Development Workflow” (2026-03-10).

The Problem

Early attempts at workflow automation via CLAUDE.md used descriptive prose (“fetch the ticket, then create a branch”). Claude treated these as loose guidance, not executable steps. The result was inconsistent behavior, especially after context compaction wiped in-memory state.

The Pattern: Structured Markdown as Execution Engine

Replace narrative prose with explicit decision trees and numbered steps. The markdown then plays the role of the program, and Claude acts as its interpreter.

Core Components

1. Numbered Steps with Preconditions

Each step has a defined trigger, action, and exit condition. Claude executes them sequentially rather than interpreting intent from paragraphs.
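As a minimal sketch of that structure, each step can be modeled as a precondition, an action, and an exit condition, run strictly in order. The `Step` type, the `run_steps` helper, and the two example steps are hypothetical illustrations, not taken from the source workflow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Step:
    number: int
    name: str
    precondition: Callable[[State], bool]    # trigger: may this step start?
    action: Callable[[State], None]          # what the step does
    exit_condition: Callable[[State], bool]  # evidence that the step finished


def run_steps(steps: List[Step], state: State) -> List[int]:
    """Run steps in numeric order and stop at the first unmet condition."""
    completed: List[int] = []
    for step in sorted(steps, key=lambda s: s.number):
        if not step.precondition(state):
            break
        step.action(state)
        if not step.exit_condition(state):
            break
        completed.append(step.number)
    return completed


# Hypothetical two-step workflow echoing "fetch the ticket, then create a branch".
steps = [
    Step(1, "fetch ticket",
         precondition=lambda s: "ticket_id" in s,
         action=lambda s: s.update(ticket={"id": s["ticket_id"]}),
         exit_condition=lambda s: "ticket" in s),
    Step(2, "create branch",
         precondition=lambda s: "ticket" in s,
         action=lambda s: s.update(branch=f"feature/{s['ticket_id']}"),
         exit_condition=lambda s: "branch" in s),
]
```

Calling `run_steps(steps, {"ticket_id": "TICKET-123"})` completes both steps, while an empty state stops before step 1 because its precondition is not met.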

2. Decision Trees with Explicit Branching

Rather than “handle reopened tickets appropriately,” the markdown encodes branching logic: scan comments for signals like “bug,” “doesn’t work,” or “sent back,” then route to a Plan subagent with structured context highlighting the specific issues.
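The branching rule above can be sketched as a small routing function. The signal phrases come from the text; the route labels (`plan_subagent`, `standard`) and the shape of the returned dictionary are assumptions made for illustration.

```python
from typing import Dict, List

# Signal phrases quoted in the text; matching is case-insensitive.
REOPEN_SIGNALS = ("bug", "doesn't work", "sent back")


def normalize(comment: str) -> str:
    """Lowercase and convert typographic apostrophes to plain ones."""
    return comment.lower().replace("\u2019", "'")


def route_ticket(comments: List[str]) -> Dict[str, object]:
    """Return a routing decision for a ticket based on its comments."""
    # Plain substring matching: "bug" would also match "debug". A real
    # workflow may need word-boundary matching instead.
    flagged = [
        c for c in comments
        if any(signal in normalize(c) for signal in REOPEN_SIGNALS)
    ]
    if flagged:
        # Reopened ticket: pass the specific complaints on as planning context.
        return {"route": "plan_subagent", "issues": flagged}
    return {"route": "standard", "issues": []}
```

The point of the sketch is that the branch condition and both outcomes are spelled out, leaving nothing for the agent to infer from a phrase like "appropriately".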

3. Checkpoint Markers Persisted to Disk

Dual persistence survives context compaction:

  • .claude-session/TICKET-XXX.json tracks workflow state, current step, and session metadata
  • .claude-session/TICKET-XXX-plan.md stores implementation tasks as checkboxes (the canonical progress tracker)

When context compacts, Claude re-reads these files and resumes from the exact step and task.
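A minimal sketch of the resume logic, assuming the JSON file holds a `current_step` field and the plan file uses `- [ ]` / `- [x]` checkboxes. The field name and both helper functions are hypothetical; the source does not specify the file schema.

```python
import json
import re
from pathlib import Path
from typing import Optional, Tuple


def save_checkpoint(session_dir: Path, ticket: str, step: int) -> None:
    """Write the current step to <ticket>.json (field names are assumed)."""
    session_dir.mkdir(parents=True, exist_ok=True)
    state = {"ticket": ticket, "current_step": step}
    (session_dir / f"{ticket}.json").write_text(json.dumps(state, indent=2))


def resume(session_dir: Path, ticket: str) -> Tuple[int, Optional[str]]:
    """Return (current step, first unchecked task) as read back from disk."""
    state = json.loads((session_dir / f"{ticket}.json").read_text())
    plan = (session_dir / f"{ticket}-plan.md").read_text()
    # "- [ ] task" is still open; "- [x] task" is done.
    match = re.search(r"^- \[ \] (.+)$", plan, flags=re.MULTILINE)
    next_task = match.group(1) if match else None
    return state["current_step"], next_task
```

Because both values are read from disk rather than from conversation history, the result is the same whether or not the context was compacted in between.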

4. Workflow Profiles

Project-specific behavior declared in CLAUDE.md removes hardcoded conditionals:

workflow:
  base_branch: preview
  quality_gates:
    - pnpm run type-check
    - pnpm run lint
  user_testing: required
  ship:
    method: pr

The /start command reads profiles at runtime and adapts accordingly.
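To illustrate how a profile can replace hardcoded conditionals, the sketch below turns the profile above into an ordered list of actions. It assumes the YAML has already been parsed into a dictionary (parsing is out of scope here), and the action labels are illustrative rather than part of the source workflow.

```python
from typing import Dict, List

# The profile from the text, as it might look once parsed into a dict.
PROFILE = {
    "workflow": {
        "base_branch": "preview",
        "quality_gates": ["pnpm run type-check", "pnpm run lint"],
        "user_testing": "required",
        "ship": {"method": "pr"},
    }
}


def plan_from_profile(profile: Dict) -> List[str]:
    """Derive the ordered actions for a project from its profile."""
    wf = profile["workflow"]
    actions = [f"branch from {wf['base_branch']}"]
    actions += [f"run gate: {cmd}" for cmd in wf.get("quality_gates", [])]
    if wf.get("user_testing") == "required":
        actions.append("wait for manual user testing")
    actions.append(f"ship via {wf['ship']['method']}")
    return actions
```

A project with different gates or a different base branch changes only its profile; the function itself has no per-project branches.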

Multi-Repository Routing

A YAML-based Team Registry maps ticket prefixes to repositories:

  • Standard routing: prefix maps to fixed directory
  • Heterogeneous routing: prompt user to select among options
  • Post-switch preflight: stash uncommitted changes before switching
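The first two routing modes can be sketched as a lookup keyed by ticket prefix. The prefixes, paths, and the `choose` callback below are invented for illustration; the source does not show the registry's actual schema, and the stash preflight is not modeled here.

```python
from typing import Callable, Dict, List, Union

# Hypothetical registry: prefixes and paths are made up for this example.
REGISTRY: Dict[str, Union[str, List[str]]] = {
    "WEB": "~/code/web-app",                 # standard: one fixed directory
    "PLAT": ["~/code/api", "~/code/infra"],  # heterogeneous: user must choose
}


def resolve_repo(ticket_id: str, choose: Callable[[List[str]], str]) -> str:
    """Map a ticket such as 'WEB-12' to a repository directory."""
    prefix = ticket_id.split("-", 1)[0]
    if prefix not in REGISTRY:
        raise KeyError(f"no repository registered for prefix {prefix!r}")
    target = REGISTRY[prefix]
    if isinstance(target, str):
        return target
    # Several candidates: defer to the caller, e.g. by prompting the user.
    return choose(target)
```

Passing the choice in as a callback keeps the lookup itself deterministic and leaves the interactive prompt to the caller.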

Why This Works

The markdown structure constrains the LLM’s execution path much as a type system constrains which programs a compiler will accept. Instead of hoping the model interprets prose correctly, you encode the workflow as a finite state machine where each state has defined transitions.

This connects to Making Invalid States Impossible: the numbered steps make it structurally difficult for the agent to skip or reorder operations.
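The finite-state-machine framing can be made concrete with a transition table. The phase names and allowed transitions below are assumptions chosen for illustration, not the source workflow's actual states.

```python
from enum import Enum


class Phase(Enum):
    PLANNING = "planning"
    APPROVAL = "approval"
    IMPLEMENTATION = "implementation"
    VERIFICATION = "verification"
    SHIPPED = "shipped"


# Each state lists the only states it may move to.
TRANSITIONS = {
    Phase.PLANNING: {Phase.APPROVAL},
    Phase.APPROVAL: {Phase.PLANNING, Phase.IMPLEMENTATION},  # rejection loops back
    Phase.IMPLEMENTATION: {Phase.VERIFICATION},
    Phase.VERIFICATION: {Phase.IMPLEMENTATION, Phase.SHIPPED},
    Phase.SHIPPED: set(),
}


def advance(current: Phase, target: Phase) -> Phase:
    """Move to `target` only if the table allows it from `current`."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Skipping approval, for example `advance(Phase.PLANNING, Phase.IMPLEMENTATION)`, raises an error instead of silently proceeding. In the markdown version the same effect comes from step ordering and explicit branch targets rather than from code.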

Integration with Quality Gates

The workflow composes with the Superpowers methodology at specific checkpoints:

| Step | Gate            | Purpose                                  |
|------|-----------------|------------------------------------------|
| 8    | Planning        | Structured plan-writing standards        |
| 9    | Approval        | Explicit sign-off before implementation  |
| 14.5 | Verification    | Evidence of quality gate execution       |
| 15   | Circuit breaker | Block commits until manual user testing  |
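One way to picture the later gates is a check that lists every reason a commit is still blocked. The session field names (`plan_approved`, `required_gates`, `gate_evidence`, `user_testing_passed`) are hypothetical; the source does not specify how gate results are recorded.

```python
from typing import Dict, List


def commit_blockers(session: Dict) -> List[str]:
    """Return the reasons a commit must be blocked; empty means allowed."""
    blockers: List[str] = []
    if not session.get("plan_approved", False):
        blockers.append("step 9: plan not approved")
    evidence = session.get("gate_evidence", {})
    for gate in session.get("required_gates", []):
        # Verification wants recorded evidence, not a claim that the gate ran.
        if gate not in evidence:
            blockers.append(f"step 14.5: no evidence for '{gate}'")
    if not session.get("user_testing_passed", False):
        # Circuit breaker: no commit until the user confirms manual testing.
        blockers.append("step 15: manual user testing not confirmed")
    return blockers
```

Returning the full list rather than a boolean lets the agent report exactly which gate is outstanding.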

Key Tradeoff

The markdown becomes less readable to humans and more reliable for Claude. This is the correct tradeoff for automation-first workflows. Human readability matters for authoring and debugging the state machine, not for runtime execution.

Actionable Takeaway

If you want an AI agent to do something complex and do it reliably, the answer is not better prose instructions. It is more structured ones. Encode workflows as numbered steps with explicit branching, persist state to disk at checkpoints, and design for context compaction as a guaranteed event, not an edge case.
