Coupling Analyzers Were Solved In 2003

James Phoenix

Java and C# had topology-aware static analysis for twenty years. JavaScript skipped it. Then AI made the gap load-bearing.


The History Nobody Reads

Robert Martin’s Agile Software Development: Principles, Patterns, and Practices shipped in 2002. By 2003 every serious Java shop had heard the words afferent coupling, efferent coupling, and instability. The book gave them three numbers per package:

  • Ce (efferent): how many things this package depends on.
  • Ca (afferent): how many things depend on this package.
  • I (instability): Ce / (Ca + Ce). Zero is a stable foundation, one is an unstable leaf.

The rule was simple. Stable things should not depend on unstable things. If the metrics inverted, your architecture was lying.
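The rule can be checked with arithmetic alone. A minimal sketch (function names are mine, not from any of the tools below):

```typescript
// Instability: Ce / (Ca + Ce). 0 = stable foundation, 1 = unstable leaf.
function instability(efferent: number, afferent: number): number {
  const total = afferent + efferent
  return total === 0 ? 0 : efferent / total
}

// Stable Dependencies Principle: a package should only depend on
// packages at least as stable (lower or equal instability) as itself.
function violatesSDP(dependerI: number, dependeeI: number): boolean {
  return dependeeI > dependerI
}
```

A package with Ce = 3 and Ca = 1 sits at I = 0.75: an unstable leaf that nothing stable should depend on.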

The tools followed quickly. JDepend in 2000. NDepend for .NET in 2004. JArchitect, Structure101, Sonargraph, Lattix. ArchUnit landed in 2017 and turned the rules into JUnit tests so a CI pipeline could refuse a merge that crossed a context boundary. Patrick Smacchia spent a career on this. The .NET FxCop lineage rolled into Roslyn analyzers. Whole conferences were built on the idea that a codebase has a topology and that topology can be measured, gated, and regressed against.

This was twenty years ago. It is not new. It is not interesting. It is plumbing.

What JavaScript Did Instead

The JavaScript ecosystem skipped this entire conversation. We got linters that count semicolons. We got bundle analyzers that draw pretty graphs. We got madge for circular imports and dependency-cruiser for path rules. Useful, but binary. They tell you whether an edge exists, not whether the number of edges between two regions of your codebase has quietly tripled in a quarter.

Nobody on a Node project has ever said “our instability index drifted from 0.3 to 0.6 last sprint, we should look at why.” We do not even have the vocabulary in the room.

There are reasons. JavaScript was a scripting language for fifteen years before TypeScript made structural analysis tractable. The community optimised for ergonomics, not architecture. The bundler was the only tool that read the whole graph, and bundlers care about bytes shipped, not contexts respected. By the time the language could carry the analysis, the cultural moment had passed.

So we built domains and called them domains. We drew bounded contexts in slide decks. We wrote ADRs about how billing should not import from scheduling. And then we did not measure any of it, because measuring it required tooling nobody wrote.

Why AI Closes The Window On Pretending

A coupling analyzer was a nice-to-have when humans wrote the code. A senior engineer reviewing a PR would feel the wrongness of a fourteenth port being added between two contexts. The reviewer was the topology lint. Slow, biased, expensive, but real.

AI generation breaks that loop in a specific way. The model is excellent inside a single context. It writes fine functions, fine tests, fine types. It is not bad at code. It is blind to the topology between contexts because the topology is not in the file it is editing. The relevant context is the manifest, the import graph, the event registry, and the ten files in three other directories that already import from the one being changed.

So the model does what models do. It picks the closest available abstraction and uses it. If billing exposes a getInvoiceById port, the model will reach for it from scheduling rather than route the read through an event or a projection. Each individual call looks fine. The aggregate looks like billing and scheduling quietly merged into one context with two folders.

Types pass. Tests pass. The PR ships. The bounded context is now nominal.

The defining property of AI-introduced coupling is that it is legitimate. There is no rule violation. The import is allowed. The port is public. The call site is well-named. A structural lint that asks “is this import permitted?” cannot help. The only signal is the count, the direction, and the kind of edges between regions, measured over time.

This is exactly what NDepend has computed since 2004.

The Topological Linter

I have been writing one for an Effect-based DDD codebase. The framing I keep coming back to is topological linter. Conventional lints defend the inside of a function. Type checkers defend the inside of a file. Module-boundary lints (the enforce-domain-event-contracts script in our repo, for example) defend the inside of a context. None of them defend the shape between contexts, which is the only place AI-driven coupling actually accumulates.

What It Is

A static analyzer that loads a manifest of bounded contexts, walks every cross-context import in the codebase, classifies each one, and emits two artifacts: a JSON file for machines and a Markdown file with a Mermaid graph for humans. The same shape as NDepend, ported to ts-morph and Effect.

The manifest is the architecture written down. The tool refuses to infer contexts from folder structure, because if your contexts are not explicit, they do not exist:

{
  "contexts": [
    { "name": "billing",    "root": "src/billing",    "ports": "src/billing/ports/**/*.ts" },
    { "name": "scheduling", "root": "src/scheduling", "ports": "src/scheduling/ports/**/*.ts" },
    { "name": "identity",   "root": "src/identity",   "ports": "src/identity/ports/**/*.ts" }
  ],
  "thresholds": {
    "maxPortsPerPair": 5,
    "writeCouplingMultiplier": 2,
    "typeOnlyImportWeight": 0.25,
    "allowCycles": false
  }
}

Anything not declared is invisible. That is the point. You cannot measure boundaries you have not named.
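Resolving a file to its owning context is then a prefix match against the declared roots. A sketch of the shape (the real resolver presumably normalises paths and also evaluates the ports glob; here I thread the manifest explicitly, while the `contextOf` in the next pass closes over it):

```typescript
interface ContextDecl {
  name: string
  root: string
  ports: string
}

// Map a file path to the declared context that owns it, or
// undefined if the file lives outside every declared root.
function contextOf(filePath: string, contexts: ContextDecl[]): string | undefined {
  return contexts.find((c) => filePath.startsWith(c.root + "/"))?.name
}
```

Returning undefined for undeclared files is what makes "anything not declared is invisible" literal: those edges never enter the graph.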

How It Works

Three passes, all mechanical.

Pass 1: resolve every cross-context edge. ts-morph gives us the TypeScript compiler’s view of the program, including symbol resolution through barrel re-exports. The analyzer walks each source file in a context, follows every named import to its defining file (not the barrel it transited through), and records an edge if the source and target live in different contexts:

for (const decl of sourceFile.getImportDeclarations()) {
  for (const named of decl.getNamedImports()) {
    // Follow the alias so a barrel re-export resolves to the defining file
    const symbol = named.getSymbol()?.getAliasedSymbol() ?? named.getSymbol()
    const definedIn = symbol?.getDeclarations()[0]?.getSourceFile().getFilePath()
    if (!definedIn) continue
    const fromCtx = contextOf(sourceFile.getFilePath())
    const toCtx   = contextOf(definedIn)
    // Only record edges that cross a declared context boundary
    if (fromCtx && toCtx && fromCtx !== toCtx) {
      edges.push({ from: fromCtx, to: toCtx, name: named.getName(), isTypeOnly: decl.isTypeOnly() })
    }
  }
}

getAliasedSymbol() is the part that makes this honest. Without it, a barrel that re-exports ten ports looks like a single edge. With it, ten ports are ten edges, attributed to the file that actually defines each one.

Pass 2: classify each edge. Method names tell you what kind of coupling you have. Reads are recoverable. Writes are not:

const READS  = /^(get|find|query|list|search|count|exists|has|fetch)/i
const WRITES = /^(create|update|delete|apply|record|set|add|insert|remove)/i

Type-only imports get a fractional weight because a shared type across a boundary is real but cheap. Imports of effect itself are filtered, so the runtime does not masquerade as coupling.
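Put together, classification is a pattern match with an explicit fallback; I'm assuming here that names matching neither list count as reads at weight 1, which the real analyzer may handle differently:

```typescript
const READS  = /^(get|find|query|list|search|count|exists|has|fetch)/i
const WRITES = /^(create|update|delete|apply|record|set|add|insert|remove)/i

type EdgeKind = "read" | "write" | "typeOnly" | "unknown"

function classify(name: string, isTypeOnly: boolean): EdgeKind {
  if (isTypeOnly) return "typeOnly"
  if (WRITES.test(name)) return "write"
  if (READS.test(name)) return "read"
  return "unknown"
}

// Per-edge weight, matching the manifest thresholds:
// writes are doubled, type-only imports are fractional.
function weight(kind: EdgeKind, writeMultiplier = 2, typeOnlyWeight = 0.25): number {
  switch (kind) {
    case "write":    return writeMultiplier
    case "typeOnly": return typeOnlyWeight
    default:         return 1
  }
}
```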

Pass 3: aggregate. The math is the 2003 math:

const Ce = uniqueDependencies(ctx).size       // efferent: contexts this one depends on
const Ca = uniqueDependents(ctx).size         // afferent: contexts depending on this one
const I  = Ce / (Ca + Ce || 1)                // instability; || 1 guards division by zero
const score = writes * writeMultiplier + reads + typeOnly * typeOnlyWeight

Cycles via Tarjan’s SCC. Threshold violations are a simple comparison. There is nothing clever in the metrics file, and that is deliberate. Cleverness belongs in interpretation, not measurement.
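Cycle detection over the context graph is the textbook strongly-connected-components pass. A compact Tarjan sketch over an adjacency list keyed by context name (any component larger than one node is a cycle; a context cannot cross-import itself, so self-loops need no special case):

```typescript
// Tarjan's SCC over a directed graph of context names.
function cycles(graph: Map<string, string[]>): string[][] {
  let index = 0
  const indices = new Map<string, number>()
  const lowlink = new Map<string, number>()
  const onStack = new Set<string>()
  const stack: string[] = []
  const components: string[][] = []

  function strongconnect(v: string): void {
    indices.set(v, index)
    lowlink.set(v, index)
    index++
    stack.push(v)
    onStack.add(v)
    for (const w of graph.get(v) ?? []) {
      if (!indices.has(w)) {
        strongconnect(w)
        lowlink.set(v, Math.min(lowlink.get(v)!, lowlink.get(w)!))
      } else if (onStack.has(w)) {
        lowlink.set(v, Math.min(lowlink.get(v)!, indices.get(w)!))
      }
    }
    // v is the root of a component: pop everything above it off the stack
    if (lowlink.get(v) === indices.get(v)) {
      const component: string[] = []
      let w: string
      do {
        w = stack.pop()!
        onStack.delete(w)
        component.push(w)
      } while (w !== v)
      components.push(component)
    }
  }

  for (const v of graph.keys()) if (!indices.has(v)) strongconnect(v)
  // Components of size 1 are acyclic; only multi-node components are cycles
  return components.filter((c) => c.length > 1)
}
```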


Domain events get their own pass. A context emitting an event that another context consumes is real coupling with zero import-graph signal. So the analyzer reads the event-tag registry ('billing.credits_purchased' and the like) and the worker’s switch statement that dispatches those tags, then emits kind: "event" edges that sit alongside the import edges in the same graph.
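The event pass reduces to a join between who emits a tag and who handles it. A toy sketch of the shape (the registry and dispatcher maps here are illustrative inputs, not our actual code, which walks the AST to build them):

```typescript
interface EventEdge {
  from: string
  to: string
  tag: string
  kind: "event"
}

// emitters: event tag -> context that publishes it (from the tag registry)
// handlers: event tag -> contexts whose workers dispatch on it
function eventEdges(
  emitters: Map<string, string>,
  handlers: Map<string, string[]>,
): EventEdge[] {
  const edges: EventEdge[] = []
  for (const [tag, from] of emitters) {
    for (const to of handlers.get(tag) ?? []) {
      // A context handling its own event is not cross-context coupling
      if (to !== from) edges.push({ from, to, tag, kind: "event" })
    }
  }
  return edges
}
```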

Why It Works

Because the output is dense enough for an agent to reason about and sparse enough for a human to scan. A pair entry in the JSON looks like:

{
  "from": "scheduling",
  "to": "billing",
  "ports": 7,
  "callSites": 23,
  "reads": 4,
  "writes": 18,
  "typeOnly": 1,
  "weightedScore": 40.25
}

The Markdown report renders the same data as a graph that GitHub can paint:

graph LR
  scheduling -- 7 ports --> billing
  scheduling -- 2 ports --> identity
  billing -. event .-> notifications
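Rendering that graph from the pair entries is a string fold; a sketch, assuming a pair shape like the JSON above with an optional kind discriminator:

```typescript
interface Pair {
  from: string
  to: string
  ports: number
  kind?: "import" | "event"
}

// Emit Mermaid source: solid labelled arrows for import coupling,
// dotted arrows for event coupling.
function toMermaid(pairs: Pair[]): string {
  const lines = pairs.map((p) =>
    p.kind === "event"
      ? `  ${p.from} -. event .-> ${p.to}`
      : `  ${p.from} -- ${p.ports} ports --> ${p.to}`,
  )
  return ["graph LR", ...lines].join("\n")
}
```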

Neither artifact is the product. The skill that reads the JSON and explains it in domain language is the product. When asked “which of my boundaries look weakest right now,” the agent does not return numbers. It returns something like:

Scheduling and billing share seven ports, but eighteen of twenty-three call sites are writes. The Invoice aggregate is leaking into scheduling. A read coupling at this volume would be fixable with a projection. Write coupling at this volume means the boundary no longer exists. Either Invoice belongs in scheduling (merge), or there is a missing third context for the shared concept (extract).

That paragraph is the deliverable. The metrics are the substrate that lets the model write it accurately, with citations, every time. The analyzer is mechanical so the interpretation can be opinionated. The same separation that made NDepend useful in 2004 is what makes this useful in 2026: numbers you trust, judgement you direct.

Two design choices are worth defending. Write coupling weighs twice as much as read coupling, because reading across a boundary is solvable with a projection, while writing across a boundary means someone else’s aggregate is showing through your code. And the manifest is required, never inferred, because the alternative is the tool quietly disagreeing with your slide decks about where the contexts actually are.

What Twenty Years Bought Java That We Get For Free

The advantage of being late is that we get to skip the bad ideas. NDepend’s UI is a maximalist mess. Structure101’s licensing was painful enough to keep the tool out of small shops. ArchUnit’s DSL is verbose because Java was verbose. The metrics survived all of that. They are not the part anyone got wrong.

What we add is the AI loop. A 2004 coupling analyzer was a Friday-afternoon report nobody read. A 2026 coupling analyzer is a JSON file an agent re-reads on every PR, attached as context to the model that is about to write the next round of code. The metrics become a feedback signal in the same loop that produced the drift. The agent that introduced the coupling is the agent that explains it.

If the model is going to write inside the rooms, something has to defend the seams. Java had this. C# had this. We forgot. AI made forgetting expensive.

The fix is plumbing. It always was.

