Systems Thinking & Observability

James Phoenix

Software should be treated as a measurable dynamical system, not as a collection of features.

The Core Shift

You no longer see “shipping code” as the primary unit of progress. Instead, you see observability, feedback, and control as the real levers.

A repository is not just a place to store code. It is a harness that allows humans and agents to:

Perceive system state
Detect deviations
Intervene intelligently

Metrics, traces, logs, CI signals, and deployment feedback are not “nice to have”; they are the sensory organs that make higher-order automation possible.

Tests vs Telemetry: Behaviour and Physics

Tests = Behaviour (what should happen)

Unit tests for pure logic
Integration tests for service boundaries (DB, queues, Temporal, external APIs)
End-to-end tests for the “golden user flows”

OTEL = Physics (what did happen, under load, over time)

Trace propagation across Vercel → Fastify → Temporal workers
Metrics for latency, error rates, retries, queue depth, workflow durations
Logs only as structured supplements (not your primary observability layer)
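
Trace propagation across those hops depends on each service forwarding a trace context to the next. The sketch below hand-rolls the W3C Trace Context `traceparent` header to show the mechanics. It is an illustration only: in a real deployment the OpenTelemetry SDK's propagators would normally handle this, and every function name here (`parseTraceparent`, `formatTraceparent`, `childContext`, `randomHex`) is hypothetical.

```typescript
// Minimal sketch of W3C Trace Context propagation via the `traceparent` header.
// Hand-rolled for illustration; only header version "00" is handled.
interface TraceContext {
  traceId: string; // 32 lowercase hex chars, shared by every hop of a request
  spanId: string; // 16 lowercase hex chars, unique per hop
  sampled: boolean;
}

const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Not cryptographically strong; good enough for a demonstration.
function randomHex(length: number): string {
  let out = "";
  for (let i = 0; i < length; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

function parseTraceparent(header: string): TraceContext | null {
  const match = TRACEPARENT.exec(header);
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero IDs are invalid in the W3C format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 0x01) === 1 };
}

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

// Each hop keeps the incoming trace ID and mints a fresh span ID.
function childContext(parent: TraceContext | null): TraceContext {
  return {
    traceId: parent ? parent.traceId : randomHex(32),
    spanId: randomHex(16),
    sampled: parent ? parent.sampled : true,
  };
}

// Simulate three hops: edge (Vercel) -> API (Fastify) -> worker (Temporal).
const edge = childContext(null);
const api = childContext(parseTraceparent(formatTraceparent(edge)));
const worker = childContext(parseTraceparent(formatTraceparent(api)));
```

Because all three contexts share one `traceId`, a tracing backend can stitch the hops into a single trace, which is what makes the "what did happen" view possible end to end.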

Tightening Testing Bounds with Telemetry

If the scrape or export interval between the OTEL Collector and Prometheus is long, your feedback is coarse: short-lived failures can start and recover between two samples without ever showing up.

Reducing that interval increases temporal resolution, so your tests can assert behaviour within tighter time bounds.
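
To make the link between interval and time bound concrete, here is a rough sketch under a simplified model: a metric becomes queryable only after the collector exports it and Prometheus scrapes it, so in the worst case the two intervals add up. The interface, function names, and the numbers are all assumptions for illustration, not measured values.

```typescript
// Simplified model: a metric written by the app is visible to a test only
// after one export tick and one scrape tick. Worst case, the write lands
// just after both ticks, so the two delays add.
interface PipelineIntervals {
  exportIntervalMs: number; // assumed OTEL Collector export interval
  scrapeIntervalMs: number; // assumed Prometheus scrape interval
}

function worstCaseVisibilityMs(p: PipelineIntervals, marginMs: number = 0): number {
  return p.exportIntervalMs + p.scrapeIntervalMs + marginMs;
}

// A telemetry assertion can only be as tight as the pipeline is fast.
function canAssertWithin(p: PipelineIntervals, boundMs: number): boolean {
  return worstCaseVisibilityMs(p) <= boundMs;
}

const coarse: PipelineIntervals = { exportIntervalMs: 60000, scrapeIntervalMs: 60000 };
const fine: PipelineIntervals = { exportIntervalMs: 1000, scrapeIntervalMs: 5000 };

console.log(worstCaseVisibilityMs(coarse)); // 120000
console.log(worstCaseVisibilityMs(fine)); // 6000
console.log(canAssertWithin(coarse, 10000)); // false
console.log(canAssertWithin(fine, 10000)); // true
```

Under these assumed numbers, a test that wants to assert "the error rate recovered within 10 seconds" is only meaningful against the finer pipeline.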

In engineering terms: you’re reducing measurement latency, which improves the stability and responsiveness of your verification loop.

Actionably:

Pick a “fast feedback” interval for test environments (tighter scrapes/export)
Keep production intervals sane for cost/noise
Ensure you can temporarily tighten during incident investigation or load tests
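
The three points above could be captured in a small piece of configuration logic. This is a sketch, assuming a per-environment default plus a time-boxed override for incidents or load tests; the environment names, default values, and the `effectiveScrapeIntervalMs` helper are hypothetical.

```typescript
type Environment = "test" | "production";

interface IntervalOverride {
  scrapeIntervalMs: number;
  expiresAtMs: number; // the override is ignored from this timestamp onward
}

// Hypothetical defaults; tune them to your own cost and noise budget.
const DEFAULT_SCRAPE_MS: Record<Environment, number> = {
  test: 5000, // fast feedback for verification loops
  production: 60000, // coarser to keep cost and noise down
};

function effectiveScrapeIntervalMs(
  env: Environment,
  nowMs: number,
  override?: IntervalOverride,
): number {
  const base = DEFAULT_SCRAPE_MS[env];
  if (override && nowMs < override.expiresAtMs) {
    // Only ever tighten: an override must not make feedback coarser.
    return Math.min(base, override.scrapeIntervalMs);
  }
  return base;
}

// Incident investigation: tighten production to 10s until t = 1000.
const incident: IntervalOverride = { scrapeIntervalMs: 10000, expiresAtMs: 1000 };
console.log(effectiveScrapeIntervalMs("production", 500, incident)); // 10000
console.log(effectiveScrapeIntervalMs("production", 1500, incident)); // 60000
```

Giving the override an expiry means a forgotten incident setting falls back to the production default on its own.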

The Stack as Substrate (GCP + Vercel)

Layer	Technology	Purpose
UI + thin BFF	Vercel/Next.js	Frontend only
Domain gate	GCP/Cloud Run + Fastify	Writes, invariants, auth enforcement
Orchestration	Temporal	Multi-step / retryable / long-running
DB/auth/storage	Supabase	Service layer, not “backend brain”
IaC	Terraform	Reproducible environments, IAM, secrets

Rule: All meaningful writes go through Fastify (keeps invariants central and agent-safe).
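
One way to picture the rule is a single gate function that every write must pass through. The sketch below is framework-free and deliberately generic: the `WriteCommand` shape, the example invariant, and `domainGate` are hypothetical, and in the stack described here the function would sit inside a Fastify route handler rather than be called directly.

```typescript
interface WriteCommand {
  actorId: string | null; // null means the caller is unauthenticated
  orderId: string;
  quantity: number;
}

type GateResult =
  | { ok: true }
  | { ok: false; status: number; reason: string };

// An invariant returns a violation message, or null when the command is valid.
type Invariant = (cmd: WriteCommand) => string | null;

const invariants: Invariant[] = [
  (cmd) =>
    Number.isInteger(cmd.quantity) && cmd.quantity > 0
      ? null
      : "quantity must be a positive integer",
];

// The only path to persistence: auth first, then invariants, then the write.
function domainGate(
  cmd: WriteCommand,
  persist: (cmd: WriteCommand) => void,
): GateResult {
  if (cmd.actorId === null) {
    return { ok: false, status: 401, reason: "unauthenticated" };
  }
  for (const check of invariants) {
    const violation = check(cmd);
    if (violation !== null) {
      return { ok: false, status: 422, reason: violation };
    }
  }
  persist(cmd);
  return { ok: true };
}

// Usage with an in-memory stand-in for the database.
const store: WriteCommand[] = [];
const accepted = domainGate(
  { actorId: "agent-1", orderId: "o-1", quantity: 2 },
  (cmd) => store.push(cmd),
);
const rejected = domainGate(
  { actorId: "agent-1", orderId: "o-2", quantity: -1 },
  (cmd) => store.push(cmd),
);
```

Because humans and agents hit the same gate, an agent cannot bypass an invariant by writing to the database directly.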

Agents Need Environments, Not Prompts

An LLM with access to traces, metrics, logs, test results, and historical failures is categorically more capable than one operating blind.

This reframes engineering work: you are not just building features, you are constructing the epistemic environment in which agents operate.

Good infra is not overhead. It is intelligence amplification.

The AI/Math/Human Triangle

Role	Best For
AI	Generation, exploration, candidate production
Mathematics	Constraints, optimisation, stability, guarantees
Humans	Taste, trust boundaries, final approval

Systems that try to let AI “decide everything” become unstable.
Systems that constrain AI with math and human checkpoints become scalable.
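
The division of labour can be sketched as a three-stage pipeline. Everything here is a toy assumption: the `Candidate` shape, a single cost budget standing in for the mathematical constraints, and an `approve` callback standing in for a human reviewer.

```typescript
interface Candidate {
  id: string;
  cost: number;
}

// Mathematics: a hard constraint filters, an objective (lowest cost) ranks.
function selectForReview(
  candidates: Candidate[],
  budget: number,
  topK: number,
): Candidate[] {
  return candidates
    .filter((c) => c.cost <= budget) // filter() copies, so the input is not mutated by sort()
    .sort((a, b) => a.cost - b.cost)
    .slice(0, topK);
}

// Humans: nothing is released without an explicit approval.
function release(
  shortlist: Candidate[],
  approve: (c: Candidate) => boolean,
): Candidate | null {
  const chosen = shortlist.find(approve);
  return chosen === undefined ? null : chosen;
}

// AI: a stand-in list of generated candidates.
const generated: Candidate[] = [
  { id: "a", cost: 9 },
  { id: "b", cost: 3 },
  { id: "c", cost: 5 },
  { id: "d", cost: 20 },
];

const shortlist = selectForReview(generated, 10, 2); // b (3), then c (5)
const shipped = release(shortlist, (c) => c.id === "c"); // reviewer picks c
```

The AI stage can propose anything, but only candidates that satisfy the constraint reach a human, and only an approved candidate is released.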

Use Linters to Enforce Test Contracts

Use ESLint (or an equivalent linter) to enforce, statically, that every module ships with its required tests and observability hooks.
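
As a rough illustration of the test half of such a contract, the check below flags modules that have no sibling test file. It assumes a `foo.ts` / `foo.test.ts` naming convention and is plain TypeScript, not ESLint's rule API; a custom lint rule or CI script could wrap the same logic. `findModulesMissingTests` is a hypothetical helper.

```typescript
// Returns every source module that has no sibling `*.test.ts` file.
// Assumes the convention `src/foo.ts` is tested by `src/foo.test.ts`.
function findModulesMissingTests(paths: string[]): string[] {
  const all = new Set(paths);
  return paths
    .filter(
      (p) =>
        p.endsWith(".ts") &&
        !p.endsWith(".test.ts") &&
        !p.endsWith(".d.ts"), // type declarations carry no logic to test
    )
    .filter((p) => !all.has(p.replace(/\.ts$/, ".test.ts")));
}

const missing = findModulesMissingTests([
  "src/billing.ts",
  "src/billing.test.ts",
  "src/orders.ts",
  "src/types.d.ts",
]);
console.log(missing); // [ 'src/orders.ts' ]
```

Failing the build when the returned list is non-empty turns "every module has tests" from a team norm into a machine-checked contract.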

Infrastructure Principles
Liquidation Cadence
Closed-Loop Telemetry Optimization – Telemetry as control input
Building the Harness – The meta-engineering layer

Mathematical Foundations

Control Theory – Feedback loops, stability, PID control
Optimisation – Objective functions, trade-offs, constraints
Probability – Decision-making under uncertainty