Test Custom Infrastructure: Avoiding the House on Stilts

James Phoenix

Summary

Custom tooling that doesn’t work creates cascading failures downstream, much like a house built on stilts. This article shows how to treat custom infrastructure (test utilities, CLIs, parsers, build scripts) as first-class code with comprehensive tests, which prevents compound failures and increases confidence in automation.

The Problem

Custom tooling (test utilities, CLIs, parsers, build scripts) often lacks tests. When these foundational tools break, everything built on top of them fails. LLMs generate code that relies on the broken infrastructure, and the result is a cascade of confusing failures. Teams spend hours debugging application code when the real issue is in the untested tooling layer. This is the “house on stilts” problem: unreliable foundations doom everything above them.

The Solution

Treat custom infrastructure as first-class code requiring comprehensive tests. Before writing application code that depends on custom tooling, write tests that verify the tooling works correctly. Use integration tests to validate end-to-end tool behavior. Apply the same quality standards to infrastructure code as to application code. This creates a reliable foundation that LLMs can confidently build upon.
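As a minimal sketch of what this can look like in practice, the example below tests a small piece of custom tooling at two levels: unit tests for a parser and an integration test that drives the CLI entry point end to end against a real temporary file. The tool itself (`parse_env_lines`, `main`, and the `envtool` name) is hypothetical and invented purely for illustration; the article does not prescribe a specific tool or test framework, and Python's standard `unittest` module is assumed here only for convenience.

```python
import io
import tempfile
import unittest
from contextlib import redirect_stdout
from pathlib import Path


def parse_env_lines(text: str) -> dict[str, str]:
    """Hypothetical custom parser: KEY=VALUE lines; '#' comments and blanks are ignored."""
    result: dict[str, str] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            # Fail loudly so a malformed file never silently yields bad config.
            raise ValueError(f"line {lineno}: expected KEY=VALUE, got {raw!r}")
        result[key.strip()] = value.strip()
    return result


def main(argv: list[str]) -> int:
    """Hypothetical CLI entry point: print parsed pairs in sorted key order."""
    if len(argv) != 1:
        print("usage: envtool FILE")
        return 2
    try:
        parsed = parse_env_lines(Path(argv[0]).read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        print(f"error: {exc}")
        return 1
    for key in sorted(parsed):
        print(f"{key}={parsed[key]}")
    return 0


class ParserUnitTests(unittest.TestCase):
    """Verify the tooling in isolation before anything depends on it."""

    def test_parses_pairs_and_skips_comments(self):
        text = "# comment\nA=1\n\nB = two \n"
        self.assertEqual(parse_env_lines(text), {"A": "1", "B": "two"})

    def test_rejects_malformed_line(self):
        with self.assertRaises(ValueError):
            parse_env_lines("A=1\nnot a pair\n")


class CliIntegrationTests(unittest.TestCase):
    """Validate end-to-end behavior: real file in, exit code and output out."""

    def run_cli(self, content: str) -> tuple[int, str]:
        with tempfile.TemporaryDirectory() as tmp:
            path = Path(tmp) / "sample.env"
            path.write_text(content, encoding="utf-8")
            out = io.StringIO()
            with redirect_stdout(out):
                code = main([str(path)])
        return code, out.getvalue()

    def test_end_to_end_success(self):
        code, out = self.run_cli("B=2\nA=1\n")
        self.assertEqual((code, out), (0, "A=1\nB=2\n"))

    def test_end_to_end_failure_exit_code(self):
        code, out = self.run_cli("broken\n")
        self.assertEqual(code, 1)
        self.assertIn("line 1", out)
```

Saved as a module, these tests can be run with `python -m unittest`. The point of the failure-path tests is that a broken input produces a clear, located error in the tooling layer rather than a confusing failure somewhere in the application code built on top of it.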

Topics

Agent Reliability, CI/CD, Developer Experience, Quality Gates, Testing
