Summary
Custom tooling that doesn’t work creates cascading failures downstream, like a house built on stilts. This article shows how to treat custom infrastructure (test utilities, CLIs, parsers, build scripts) as first-class code with comprehensive tests, which prevents compound failures and increases confidence in automation.
The Problem
Custom tooling (test utilities, CLIs, parsers, build scripts) often lacks tests. When these foundational tools break, everything built on top of them fails. LLMs then generate code against the broken infrastructure, producing a cascade of confusing failures. Teams spend hours debugging application code when the real issue is in the untested tooling layer. This is the “house on stilts” problem: unreliable foundations doom everything above them.
The Solution
Treat custom infrastructure as first-class code requiring comprehensive tests. Before writing application code that depends on custom tooling, write tests that verify the tooling works correctly. Use integration tests to validate end-to-end tool behavior. Apply the same quality standards to infrastructure code as to application code. This creates a reliable foundation that LLMs can confidently build upon.
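As a rough illustration of testing the tooling before anything depends on it, here is a minimal sketch. The names `parseConfig`, `check`, and `verifyParseConfig` are hypothetical and stand in for whatever custom utility a project actually has. The hand-rolled `check` helper is only there so the sketch runs without dependencies. In a real project these checks would more likely be written as Vitest `test` and `expect` calls.

```typescript
// Hypothetical custom tooling: a tiny KEY=VALUE config parser that
// application code (or LLM-generated code) would build on.
function parseConfig(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    // A missing "=" or an empty key is treated as malformed input.
    if (eq <= 0) {
      throw new Error(`Invalid config line: "${rawLine}"`);
    }
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    result[key] = value;
  }
  return result;
}

// Minimal assertion helper (an assumption made for this sketch only).
function check(condition: boolean, message: string): void {
  if (!condition) throw new Error(`Tooling check failed: ${message}`);
}

// Tests for the tooling itself. Returns the names of the groups that passed.
function verifyParseConfig(): string[] {
  const passed: string[] = [];

  const parsed = parseConfig("# comment\nPORT = 8080\n\nHOST=localhost\n");
  check(parsed["PORT"] === "8080", "trims whitespace around key and value");
  check(parsed["HOST"] === "localhost", "parses a simple pair");
  check(Object.keys(parsed).length === 2, "skips comments and blank lines");
  passed.push("parses valid input");

  const withEquals = parseConfig("URL=http://x?a=b");
  check(withEquals["URL"] === "http://x?a=b", "splits on the first '=' only");
  passed.push("keeps '=' inside values");

  let threw = false;
  try {
    parseConfig("not a pair");
  } catch {
    threw = true;
  }
  check(threw, "rejects malformed lines");
  passed.push("rejects malformed input");

  return passed;
}
```

Running `verifyParseConfig()` first, for example in CI or a pre-commit step, means a failure points directly at the tooling layer rather than surfacing later as a confusing error in application code.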
Related Concepts
- Quality Gates as Information Filters – Tests as information filters that reduce state space
- Verification Sandwich Pattern – Establish baseline before and after code changes
- Integration Testing Patterns – Integration tests provide higher signal for LLM-generated code
- Test-Based Regression Patching – Write failing tests before fixing bugs
- Test-Driven Prompting – Write tests before generating code to constrain LLM output
- Property-Based Testing for LLM-Generated Code – Catch edge cases automatically with invariants
- Automated Flaky Test Detection – Diagnose intermittent test failures systematically
- Claude Code Hooks Quality Gates – Automate quality gates with hooks
References
- Vitest Documentation – Fast unit test framework well suited to testing custom infrastructure
- Node.js child_process Documentation – For testing CLI tools with exec/spawn

