The Domain Glossary Is a Constraint, Not Documentation

James Phoenix

A glossary file at the repo root is the cheapest way I have found to stop an agent loop from quietly inventing a new vocabulary every iteration. Treat it as the spec the loop has to satisfy, not as docs for humans.


What I Noticed in Sandcastle

Matt Pocock’s sandcastle is a TypeScript library for orchestrating coding agents in sandboxes. The commit log shows that most of the work is done by an agent loop. The .mailmap file aliases eleven different agent identities back to him. Recent commits read RALPH: implement X followed by RALPH: Review - <fix>. He is dogfooding the tool on itself.

What kept the codebase coherent across all of that is one file: CONTEXT.md. Twenty-three kilobytes, sitting at the repo root. It is not architecture documentation. It is a glossary, and it does three distinct jobs that I now think any non-trivial agent-driven repo needs.


Job One: Pin One Name Per Concept

Every term in CONTEXT.md has a definition followed by an explicit Avoid: line listing the synonyms that are not allowed. A few examples:

  • Sandbox: the isolation boundary around the agent. Avoid: “container” (too specific), “Docker sandbox” (ambiguous), “workspace”.
  • Iteration: a single invocation of the agent. Avoid: “run” (ambiguous with the JS function), “cycle”, “loop”.
  • Worktree: the git worktree on the host. Avoid: “workspace”, “branch copy”, “clone”.
  • Host: the developer’s machine. Avoid: “local” (ambiguous, the sandbox also has a local filesystem).

The Avoid: line is the part that does the work. Without it, every iteration of the loop reaches for whichever synonym its training distribution happens to prefer that day, and the codebase ends up with three different words for the same thing across types, error messages, log lines, and PR titles. With it, the synonyms are an explicit no-fly list. The agent has a deterministic answer to “what do I call this?”
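Because the Avoid: lists are machine-readable, they can do more than instruct the agent; they can gate CI. Here is a small sketch of that idea, assuming the glossary follows the template later in this post. The function names and regexes are mine, not part of any existing tool:

```typescript
// Parse the _Avoid_: lists out of a CONTEXT.md-style glossary, then scan
// source text for banned synonyms. A sketch, not a hardened linter.

type AvoidList = Map<string, string[]>; // canonical term -> banned synonyms

function parseAvoidLists(glossary: string): AvoidList {
  const lists: AvoidList = new Map();
  let currentTerm = "";
  for (const line of glossary.split("\n")) {
    const term = line.match(/^\*\*(.+?)\*\*:/);
    if (term) currentTerm = term[1];
    const avoid = line.match(/_Avoid_:\s*(.+)/);
    if (avoid && currentTerm) {
      // Each banned synonym sits inside double quotes on the Avoid: line.
      const banned = [...avoid[1].matchAll(/"([^"]+)"/g)].map((m) => m[1]);
      lists.set(currentTerm, banned);
    }
  }
  return lists;
}

function findViolations(source: string, lists: AvoidList): string[] {
  const hits: string[] = [];
  for (const [canonical, banned] of lists) {
    for (const word of banned) {
      if (new RegExp(`\\b${word}\\b`, "i").test(source)) {
        hits.push(`"${word}" found; use "${canonical}" instead`);
      }
    }
  }
  return hits;
}
```

Run it over source files, log lines, and PR descriptions, and the no-fly list stops being advisory: the loop gets a failing check instead of a style nit.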

This is older than agents. Eric Evans wrote about ubiquitous language in 2003 for the same reason: human teams drift on vocabulary too. What changes with an agent loop is the rate. A team of three drifts over months. A loop drifts inside a single afternoon.


Job Two: Encode Invariants the Agent Must Respect

Sandcastle’s CONTEXT.md has a long “Relationships” section. It reads like a list of facts, but most of those facts are invariants the loop is required to honour:

  • A no-sandbox provider is only accepted by interactive(), not run(). Enforced at the type level.
  • Isolated sandbox providers cannot use the head branch strategy. There is no host directory to write to directly.
  • Built-in prompt arguments cannot be overridden. Passing SOURCE_BRANCH or TARGET_BRANCH in promptArgs is an error.
  • Prompt argument substitution runs before prompt expansion, so prompt arguments can inject values into shell expressions.

These are not trivia. Each one is a rule the loop has broken at least once, and each one shows up somewhere downstream as a type constraint, a runtime error, or a test. The Relationships section is where I would now put any rule of the form “if you reach for X, also do Y.” It is cheaper to read than the type system, and it tells the agent the rule before the rule has a chance to fire.
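To make the first invariant concrete, here is a sketch of how "a no-sandbox provider is only accepted by interactive(), not run()" can be pushed into the type system. These interfaces are illustrative stand-ins, not sandcastle's actual API:

```typescript
// Two provider shapes, discriminated by a literal tag.
interface SandboxProvider {
  kind: "sandbox";
  start(): string; // boots the isolation boundary
}

interface NoSandboxProvider {
  kind: "no-sandbox"; // the agent runs directly on the host
}

// run() is unattended, so it demands real isolation.
// Passing a NoSandboxProvider here is a compile-time error.
function run(provider: SandboxProvider): string {
  return provider.start();
}

// interactive() has a human watching, so it accepts either.
function interactive(provider: SandboxProvider | NoSandboxProvider): string {
  return provider.kind === "sandbox" ? provider.start() : "running on host";
}
```

The glossary line and the union type say the same thing, but the glossary says it before the agent writes the call site, which is when it matters.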


Job Three: Resolve Overloaded Words

There is a short “Flagged ambiguities” section at the bottom. It calls out the words the project is forced to live with that mean two things, and it forces you to qualify them:

  • “Provider” is overloaded. There is an agent provider and a sandbox provider. Never just “provider”.
  • “Run” can mean the JS run() function or one iteration. Use iteration for one agent invocation.
  • “Local” is ambiguous because the sandbox also has a local filesystem. Always use host.

I find this section the most underrated. Naming collisions are where agents make their most confident mistakes. If “provider” is unqualified in the prompt, the loop will pick whichever sense fits its current task and silently couple the two concepts. The ambiguity registry pulls those collisions into the open before the loop has a chance to collapse them.
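The same registry can be mirrored in code by giving each sense of an overloaded word its own tagged type, so neither can silently stand in for the other. A sketch with illustrative names (not sandcastle's API):

```typescript
// The two senses of "provider" become two distinct tagged types.
type AgentProvider = {
  readonly role: "agent";
  invoke(prompt: string): string; // one agent invocation, i.e. one iteration
};

type SandboxProviderT = {
  readonly role: "sandbox";
  exec(cmd: string): string; // runs a command inside the isolation boundary
};

// A function that needs the agent sense has to say so; passing a
// SandboxProviderT here fails to compile because the tags differ.
function runIteration(agent: AgentProvider): string {
  return agent.invoke("implement the next task");
}
```

The glossary entry and the tag are the same decision expressed twice: once where the agent reads, once where the compiler checks.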


Why This Is a Constraint, Not Documentation

The mistake I made the first time I tried to write one of these was treating it as a description of the system. I wrote what the code already did. The agent loop kept drifting anyway, because nothing about my prose was load-bearing on the next iteration.

The Sandcastle glossary works because it is written as a forward-looking constraint on the next iteration. Every term has an Avoid: clause that tells the agent what it is not allowed to do. Every relationship line is a rule the agent is supposed to enforce in code. The ambiguity registry is a list of words the agent is required to qualify.

When a new concept gets added to the system, the glossary is updated first. The implementation comes second. The loop reads the glossary, builds against the new definition, and the term enters the codebase consistent on its first appearance instead of being normalised after three rounds of cleanup.


How I Would Write One

The minimum viable shape is:

# <Project name>

A <one-sentence description of the system>.

## Language

### <Concept group, e.g. "Core concepts">

**<Term>**:
<One- or two-sentence definition.>
_Avoid_: "<synonym 1>", "<synonym 2>" (<reason if non-obvious>)

**<Next term>**:
...

## Relationships

- <Fact or invariant connecting two or more terms.>
- <Fact or invariant.>

## Flagged ambiguities

- **"<Word>"** -- Means X here, not Y. Use **<canonical term>** instead.

Two extra rules I would add from the start:

  1. Ban a term the moment you decide against it. The Avoid: line should land in the glossary as soon as the decision is made, not after the fifth PR rename. Otherwise the old name keeps getting reintroduced.
  2. Update the glossary before you update the code. If the loop is going to add a concept this iteration, the glossary entry goes in first. The PR for the implementation should already cite the term as if it always existed.
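Rule 2 can be approximated with a crude CI heuristic: if a change touches source but not the glossary, flag it for a human to confirm no new concept slipped in unnamed. This is my own sketch, and the `src/` and `CONTEXT.md` paths are assumptions about the repo layout:

```typescript
// Given the changed paths in a PR, return true when source changed
// but the glossary did not. A heuristic, not a proof: it cannot tell
// whether a change actually introduces a new concept.
function glossaryFirstViolated(changedPaths: string[]): boolean {
  const touchesSource = changedPaths.some((p) => p.startsWith("src/"));
  const touchesGlossary = changedPaths.includes("CONTEXT.md");
  return touchesSource && !touchesGlossary;
}
```

A check like this produces false positives on pure refactors, so it works better as a warning than a hard failure.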

Where This Sits In My Stack

This pairs cleanly with DDD bounded contexts for LLMs. Bounded contexts give you the partition. The glossary file is what stops names drifting within each partition. Across multiple contexts, the same pattern applies one level down: a CONTEXT.md per context, plus a top-level CONTEXT-MAP.md that points at each one.

It also pairs with ADRs for agent context. The glossary captures the vocabulary. The ADRs capture the decisions. When an ADR says “we rejected option B because of X,” the X is almost always a glossary term, and the rejection itself becomes a future invariant the loop needs to remember.


Key Takeaway

A glossary is the cheapest agent guardrail I know of. One file, written as a constraint rather than a description, with three sections: Avoid lists for naming, a Relationships section for invariants, and an ambiguity registry for overloaded words. The agent reads it on every iteration. The drift you would otherwise pay for in PR cleanup, you pay for once, in a markdown file at the repo root.

The naming stability is the visible benefit. The relationships section is what stops the loop from shipping a feature that compiles but breaks the safety story.
