Software should be treated as a measurable dynamical system, not as a collection of features.
The Core Shift
You no longer see “shipping code” as the primary unit of progress. Instead, you see observability, feedback, and control as the real levers.
A repository is not just a place to store code. It is a harness that allows humans and agents to:
- Perceive system state
- Detect deviations
- Intervene intelligently
Metrics, traces, logs, CI signals, and deployment feedback are not “nice to have”; they are the sensory organs that make higher-order automation possible.
Tests vs Telemetry: Physics and Behaviour
Tests = Behaviour (what should happen)
- Unit tests for pure logic
- Integration tests for service boundaries (DB, queues, Temporal, external APIs)
- End-to-end tests for the “golden user flows”
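A behaviour-layer sketch, assuming Vitest; `applyDiscount` is a hypothetical pure function standing in for real domain logic:

```ts
// discount.test.ts: behaviour sketch, assuming Vitest.
// applyDiscount is hypothetical; swap in your own pure logic.
import { describe, it, expect } from "vitest";

// Pure logic under test (inlined so the sketch is self-contained).
function applyDiscount(totalCents: number, percent: number): number {
  if (percent < 0 || percent > 100) throw new RangeError("percent out of range");
  return Math.round(totalCents * (1 - percent / 100));
}

describe("applyDiscount", () => {
  it("reduces the total by the given percentage", () => {
    expect(applyDiscount(10_000, 25)).toBe(7_500);
  });

  it("rejects percentages outside [0, 100]", () => {
    expect(() => applyDiscount(10_000, 120)).toThrow(RangeError);
  });
});
```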
OTEL = Physics (what did happen, under load, over time)
- Trace propagation across Vercel → Fastify → Temporal workers
- Metrics for latency, error rates, retries, queue depth, workflow durations
- Logs only as structured supplements (not your primary observability layer)
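A minimal sketch of wiring that physics layer in Node, assuming the standard OpenTelemetry JS packages; service and metric names are illustrative:

```ts
// otel.ts: minimal OTEL setup sketch for the Fastify/Temporal tier.
// Assumes @opentelemetry/sdk-node, @opentelemetry/sdk-metrics and the
// OTLP HTTP exporters are installed; names are illustrative.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { PeriodicExportingMetricReader } from "@opentelemetry/sdk-metrics";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-http";
import { metrics } from "@opentelemetry/api";

const sdk = new NodeSDK({
  // Context is propagated via W3C traceparent headers by default,
  // which is what carries a trace from Vercel through Fastify to Temporal.
  traceExporter: new OTLPTraceExporter(),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter(),
  }),
});
sdk.start();

// One of the physics signals named above: workflow duration.
const meter = metrics.getMeter("worker");
export const workflowDuration = meter.createHistogram("workflow.duration", {
  unit: "ms",
  description: "End-to-end Temporal workflow duration",
});
```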
Tightening Testing Bounds with Telemetry
If the OTEL Collector → Prometheus scrape/export interval is long, your feedback is coarse and failures hide between samples: a 20 s error spike can fall entirely between 60 s scrapes, while 5 s scrapes are guaranteed to catch it.
Reducing that interval increases temporal resolution, so your tests can assert behaviour within tighter time bounds.
In engineering terms: You’re reducing measurement latency, which improves the stability and responsiveness of your verification loop.
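To make "tighter time bounds" concrete, a sketch of a time-bounded assertion against Prometheus' instant-query API; the metric name, endpoint, and timings are assumptions:

```ts
// Poll Prometheus until a metric is observed, bounded in time.
// Metric name and endpoint are illustrative; assumes a 1 s test-env scrape interval.
async function waitForMetric(query: string, timeoutMs: number): Promise<number> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const res = await fetch(
      `http://localhost:9090/api/v1/query?query=${encodeURIComponent(query)}`,
    );
    const body = (await res.json()) as any;
    const value = Number(body.data?.result?.[0]?.value?.[1] ?? NaN);
    if (!Number.isNaN(value)) return value;
    await new Promise((r) => setTimeout(r, 250)); // poll well below the scrape interval
  }
  throw new Error(`metric not observed within ${timeoutMs}ms: ${query}`);
}

// With 1 s scrapes, a 5 s assertion bound is meaningful; with 60 s scrapes it is not.
// const errs = await waitForMetric("http_errors_total", 5_000);
```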
In practice:
- Pick a “fast feedback” interval for test environments (tighter scrapes/exports); a sketch follows this list
- Keep production intervals sane for cost/noise
- Ensure you can temporarily tighten intervals during incident investigation or load tests
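As flagged in the first item above, a sketch of environment-dependent export intervals using the OTEL JS SDK; the `OTEL_ENV` convention and the specific intervals are assumptions:

```ts
// Tighter export interval in test/CI, coarser in production.
// OTEL_ENV and the chosen intervals are assumptions, not a standard.
import { PeriodicExportingMetricReader } from "@opentelemetry/sdk-metrics";
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-http";

const isFastFeedback = process.env.OTEL_ENV !== "production";

export const metricReader = new PeriodicExportingMetricReader({
  exporter: new OTLPMetricExporter(),
  // 1 s in test environments so assertions can use tight time bounds;
  // 60 s in production to keep cost and noise sane.
  exportIntervalMillis: isFastFeedback ? 1_000 : 60_000,
});
```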
The Stack as Substrate (GCP + Vercel)
| Layer | Technology | Purpose |
|---|---|---|
| UI + thin BFF | Vercel/Next.js | Frontend only |
| Domain gate | GCP/Cloud Run + Fastify | Writes, invariants, auth enforcement |
| Orchestration | Temporal | Multi-step / retryable / long-running |
| DB/auth/storage | Supabase | Service layer, not “backend brain” |
| IaC | Terraform | Reproducible environments, IAM, secrets |
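For the orchestration row, a minimal Temporal workflow sketch; the activity names are hypothetical:

```ts
// order-workflow.ts: sketch of a multi-step, retryable workflow.
// Activity names (reserveInventory, chargeCard) are hypothetical.
import { proxyActivities } from "@temporalio/workflow";
import type * as activities from "./activities";

const { reserveInventory, chargeCard } = proxyActivities<typeof activities>({
  startToCloseTimeout: "1 minute",
  retry: { maximumAttempts: 5 }, // Temporal retries each step for us
});

export async function orderWorkflow(orderId: string): Promise<void> {
  await reserveInventory(orderId); // step 1: durable, retried on failure
  await chargeCard(orderId); // step 2: resumes here even after a worker crash
}
```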
Rule: All meaningful writes go through Fastify (keeps invariants central and agent-safe).
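What that rule looks like in practice: a Fastify write-gate sketch, with the route, schema, and auth hook as illustrative assumptions:

```ts
// write-gate.ts: all meaningful writes pass through this tier.
// Route shape, schema, and the auth check are illustrative assumptions.
import Fastify from "fastify";

const app = Fastify();

// Invariants live here, not in the Vercel BFF or in Supabase triggers.
app.post(
  "/orders",
  {
    schema: {
      body: {
        type: "object",
        required: ["sku", "quantity"],
        properties: {
          sku: { type: "string" },
          quantity: { type: "integer", minimum: 1 }, // invariant: no zero/negative orders
        },
      },
    },
    preHandler: async (req, reply) => {
      // Hypothetical auth enforcement; replace with your real check.
      if (!req.headers.authorization) reply.code(401).send({ error: "unauthenticated" });
    },
  },
  async (req) => {
    const { sku, quantity } = req.body as { sku: string; quantity: number };
    // ...perform the write via the service layer...
    return { ok: true, sku, quantity };
  },
);

app.listen({ port: 8080 });
```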
Agents Need Environments, Not Prompts
An LLM with access to traces, metrics, logs, test results, and historical failures is categorically more capable than one operating blind.
This reframes engineering work: you are not just building features, you are constructing the epistemic environment in which agents operate.
Good infra is not overhead. It is intelligence amplification.
The AI/Math/Human Triangle
| Role | Best For |
|---|---|
| AI | Generation, exploration, candidate production |
| Mathematics | Constraints, optimisation, stability, guarantees |
| Humans | Taste, trust boundaries, final approval |
Systems that try to let AI “decide everything” become unstable.
Systems that constrain AI with math and human checkpoints become scalable.
Use Linters to Enforce Test Contracts
Use ESLint (or equivalent) to enforce test contracts structurally: fail the lint run when a module lacks a companion test file or the expected observability hooks. (Coverage percentages themselves belong to the coverage tool; the linter guards structure.)
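A minimal sketch of such a rule, assuming ESLint's custom-rule API (`context.filename` requires ESLint ≥ 8.40) and a sibling `.test.ts` convention; the rule name is made up:

```ts
// eslint-rules/require-test-file.ts: sketch of a structural test contract.
// The rule name and the sibling `.test.ts` convention are assumptions.
import * as fs from "node:fs";
import type { Rule } from "eslint";

export const requireTestFile: Rule.RuleModule = {
  meta: {
    type: "problem",
    docs: { description: "Every source module must ship with a sibling test file" },
    messages: { missing: "No test file found for {{file}}" },
  },
  create(context) {
    return {
      Program(node) {
        const file = context.filename; // older ESLint: context.getFilename()
        if (!file.endsWith(".ts") || file.endsWith(".test.ts")) return;
        const testFile = file.replace(/\.ts$/, ".test.ts");
        if (!fs.existsSync(testFile)) {
          context.report({ node, messageId: "missing", data: { file } });
        }
      },
    };
  },
};
```

Registered as a local plugin, this turns the contract into a hard lint failure instead of a convention.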
Related
- Infrastructure Principles
- Liquidation Cadence
- Closed-Loop Telemetry Optimization – Telemetry as control input
- Building the Harness – The meta-engineering layer
Mathematical Foundations
- Control Theory – Feedback loops, stability, PID control
- Optimisation – Objective functions, trade-offs, constraints
- Probability – Decision-making under uncertainty

