Systems Thinking & Observability

James Phoenix

Software should be treated as a measurable dynamical system, not as a collection of features.


The Core Shift

You no longer see “shipping code” as the primary unit of progress. Instead, you see observability, feedback, and control as the real levers.

A repository is not just a place to store code. It is a harness that allows humans and agents to:

  • Perceive system state
  • Detect deviations
  • Intervene intelligently

Metrics, traces, logs, CI signals, and deployment feedback are not “nice to have”; they are the sensory organs that make higher-order automation possible.
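
The perceive / detect / intervene loop can be sketched in a few lines. This is a minimal illustration, not a real harness: the type names, thresholds, and suggested actions are all hypothetical, and the "perceived" state is a fixed reading rather than a live metrics query.

```typescript
// Hypothetical shapes for illustration; none of these come from a real library.
type SystemState = { errorRate: number; p95LatencyMs: number };
type Thresholds = { maxErrorRate: number; maxP95LatencyMs: number };
type Deviation = { signal: keyof SystemState; observed: number; limit: number };

// Detect: compare the perceived state against explicit limits.
function detectDeviations(state: SystemState, thresholds: Thresholds): Deviation[] {
  const deviations: Deviation[] = [];
  if (state.errorRate > thresholds.maxErrorRate) {
    deviations.push({ signal: "errorRate", observed: state.errorRate, limit: thresholds.maxErrorRate });
  }
  if (state.p95LatencyMs > thresholds.maxP95LatencyMs) {
    deviations.push({ signal: "p95LatencyMs", observed: state.p95LatencyMs, limit: thresholds.maxP95LatencyMs });
  }
  return deviations;
}

// Intervene: map each deviation to a proposed action (a human or agent still decides).
function proposeAction(deviation: Deviation): string {
  return deviation.signal === "errorRate"
    ? "roll back the latest deploy"
    : "scale out or inspect slow traces";
}

// Perceive: a real harness would query metrics here; this is a fixed reading.
const reading: SystemState = { errorRate: 0.07, p95LatencyMs: 180 };
const limits: Thresholds = { maxErrorRate: 0.01, maxP95LatencyMs: 500 };
const actions = detectDeviations(reading, limits).map(proposeAction);
```

Without the reading there is nothing to compare against the limits, which is the sense in which telemetry acts as the sensory layer for everything downstream.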


Tests vs Telemetry: Physics and Behaviour

Tests = Behaviour (what should happen)

  • Unit tests for pure logic
  • Integration tests for service boundaries (DB, queues, Temporal, external APIs)
  • End-to-end tests for the “golden user flows”

OTEL = Physics (what did happen, under load, over time)

  • Trace propagation Vercel → Fastify → Temporal workers
  • Metrics for latency, error rates, retries, queue depth, workflow durations
  • Logs only as structured supplements (not your primary observability layer)
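
Trace propagation across hops usually rides on the W3C `traceparent` header. The sketch below hand-rolls the parse-and-forward step to show what has to survive each hop (the trace ID) and what changes (the parent span ID). In practice an OpenTelemetry SDK propagator would do this for you; the function names here are hypothetical, and this simplified version keeps only the sampled bit of the flags.

```typescript
// Simplified view of a W3C traceparent header: 00-<trace-id>-<parent-id>-<flags>.
type TraceContext = { traceId: string; parentId: string; sampled: boolean };

function parseTraceparent(header: string): TraceContext | null {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return null;
  const traceId = match[1] ?? "";
  const parentId = match[2] ?? "";
  const flags = match[3] ?? "00";
  // All-zero trace or parent IDs are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.parentId}-${ctx.sampled ? "01" : "00"}`;
}

// Each hop keeps the trace ID and substitutes its own span ID as the new parent.
function continueTrace(incoming: string, newSpanId: string): string | null {
  const ctx = parseTraceparent(incoming);
  if (!ctx) return null;
  return formatTraceparent({ ...ctx, parentId: newSpanId });
}

// Example: a downstream service forwards the trace it received.
const incomingHeader = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
const outgoingHeader = continueTrace(incomingHeader, "b7ad6b7169203331");
```

If any hop drops or regenerates the trace ID, the Vercel, Fastify, and Temporal spans stop joining into one trace, which is why propagation is worth testing explicitly.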

Tightening Testing Bounds with Telemetry

If the OTEL Collector → Prometheus scrape/export interval is long, your feedback is coarse, and short-lived failures can hide between samples.

Reducing that interval increases temporal resolution, so your tests can assert behaviour within tighter time bounds.
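
A toy simulation makes the effect concrete. The signal below is invented: a queue-depth gauge that sits at 5 and spikes to 500 for five seconds. Sampling it every 15 seconds misses the spike entirely, while sampling every second catches it. This applies most directly to gauges; counters and histograms accumulate between scrapes, so they lose timing resolution rather than the events themselves.

```typescript
// Hypothetical signal: queue depth is 5 normally, spiking to 500 for seconds 20..24.
function queueDepthAt(second: number): number {
  return second >= 20 && second < 25 ? 500 : 5;
}

// Return the largest value seen when sampling at a fixed interval.
function sampledMax(intervalSeconds: number, durationSeconds: number): number {
  let max = 0;
  for (let t = 0; t < durationSeconds; t += intervalSeconds) {
    max = Math.max(max, queueDepthAt(t));
  }
  return max;
}

const coarseMax = sampledMax(15, 60); // samples at 0, 15, 30, 45: sees only 5
const fineMax = sampledMax(1, 60);    // samples every second: sees 500
```

A test asserting "queue depth never exceeds 100" would pass against the coarse samples and fail against the fine ones, even though the system behaved identically.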

In engineering terms: You’re reducing measurement latency, which improves the stability and responsiveness of your verification loop.

Actionably:

  • Pick a “fast feedback” interval for test environments (tighter scrapes/export)
  • Keep production intervals sane for cost/noise
  • Ensure you can temporarily tighten during incident investigation or load tests
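
One way to encode those three points is a small interval policy per environment with a time-boxed override. This is a sketch under assumptions: the environment names, the interval values, and the override shape are placeholders, and the resolved number would still need to be fed into your actual Collector or Prometheus configuration.

```typescript
type Environment = "test" | "production";
type IntervalPolicy = { defaultSeconds: number; tightenedSeconds: number };

// Placeholder values: tune these for your own cost and noise budget.
const policies: Record<Environment, IntervalPolicy> = {
  test: { defaultSeconds: 1, tightenedSeconds: 1 },
  production: { defaultSeconds: 30, tightenedSeconds: 5 },
};

// A temporary tightening that expires on its own, so nobody forgets to revert it.
type Override = { reason: "incident" | "load-test"; expiresAtMs: number };

function effectiveIntervalSeconds(env: Environment, nowMs: number, override?: Override): number {
  const policy = policies[env];
  const active = override !== undefined && nowMs < override.expiresAtMs;
  return active ? policy.tightenedSeconds : policy.defaultSeconds;
}

// Example: production tightened during an incident, then back to normal.
const incident: Override = { reason: "incident", expiresAtMs: 2_000 };
const duringIncident = effectiveIntervalSeconds("production", 1_000, incident);
const afterIncident = effectiveIntervalSeconds("production", 3_000, incident);
```

The expiry is the design choice that matters: a tightened production interval that silently stays on becomes a cost problem later.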

The Stack as Substrate (GCP + Vercel)

| Layer | Technology | Purpose |
|---|---|---|
| UI + thin BFF | Vercel/Next.js | Frontend only |
| Domain gate | GCP/Cloud Run + Fastify | Writes, invariants, auth enforcement |
| Orchestration | Temporal | Multi-step / retryable / long-running |
| DB/auth/storage | Supabase | Service layer, not “backend brain” |
| IaC | Terraform | Reproducible environments, IAM, secrets |

Rule: All meaningful writes go through Fastify (keeps invariants central and agent-safe).
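
As a rough sketch of what "one gate for writes" buys you, here is a framework-agnostic write function of the kind that would sit behind a Fastify route handler. Everything in it is hypothetical: the debit example, the in-memory store standing in for the database, and the specific invariants. The point is only that auth and invariants are checked in one place before any state changes.

```typescript
// Hypothetical command and result shapes for a single write path.
type WriteCommand = { actorId: string | null; accountId: string; amountCents: number };
type WriteResult = { ok: true; balanceCents: number } | { ok: false; error: string };

// In-memory stand-in for the database, for illustration only.
const balances = new Map<string, number>([["acct-1", 1000]]);

function applyDebit(cmd: WriteCommand): WriteResult {
  // Auth enforcement happens at the gate, not in the caller.
  if (cmd.actorId === null) return { ok: false, error: "unauthenticated" };
  // Invariant: amounts are positive whole cents.
  if (!Number.isInteger(cmd.amountCents) || cmd.amountCents <= 0) {
    return { ok: false, error: "amount must be a positive integer" };
  }
  const current = balances.get(cmd.accountId);
  if (current === undefined) return { ok: false, error: "unknown account" };
  // Invariant: balances never go negative.
  if (current < cmd.amountCents) return { ok: false, error: "insufficient balance" };
  const next = current - cmd.amountCents;
  balances.set(cmd.accountId, next);
  return { ok: true, balanceCents: next };
}

// Example: one accepted write, one rejected by an invariant.
const accepted = applyDebit({ actorId: "user-1", accountId: "acct-1", amountCents: 400 });
const rejected = applyDebit({ actorId: "user-1", accountId: "acct-1", amountCents: 700 });
```

An agent that can only call this function cannot put the system into a state the invariants forbid, which is what "agent-safe" means here.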


Agents Need Environments, Not Prompts

An LLM with access to traces, metrics, logs, test results, and historical failures is categorically more capable than one operating blind.

This reframes engineering work: you are not just building features, you are constructing the epistemic environment in which agents operate.

Good infra is not overhead. It is intelligence amplification.


The AI/Math/Human Triangle

| Role | Best For |
|---|---|
| AI | Generation, exploration, candidate production |
| Mathematics | Constraints, optimisation, stability, guarantees |
| Humans | Taste, trust boundaries, final approval |

Systems that try to let AI “decide everything” become unstable.
Systems that constrain AI with math and human checkpoints become scalable.
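
The division of labour can be sketched as a three-stage pipeline. This is an illustration under assumptions: the AI stage is stubbed with fixed candidates, the constraints and cost objective are invented, and the human checkpoint is reduced to a callback.

```typescript
type Proposal = { id: string; costUsd: number; expectedLatencyMs: number };

// AI role (stubbed): produce candidates, including ones that will not survive.
function generateProposals(): Proposal[] {
  return [
    { id: "a", costUsd: 40, expectedLatencyMs: 450 },
    { id: "b", costUsd: 80, expectedLatencyMs: 200 },
    { id: "c", costUsd: 60, expectedLatencyMs: 250 },
    { id: "d", costUsd: 150, expectedLatencyMs: 100 },
  ];
}

// Math role, part 1: hard constraints that no candidate may violate.
function isFeasible(p: Proposal): boolean {
  return p.costUsd <= 100 && p.expectedLatencyMs <= 300;
}

// Math role, part 2: optimise over the feasible set (here, minimise cost).
function selectBest(proposals: Proposal[]): Proposal | null {
  const feasible = proposals.filter(isFeasible);
  if (feasible.length === 0) return null;
  return feasible.reduce((best, p) => (p.costUsd < best.costUsd ? p : best));
}

// Human role: final approval on the single surviving candidate.
function decide(approve: (p: Proposal) => boolean): Proposal | null {
  const best = selectBest(generateProposals());
  return best !== null && approve(best) ? best : null;
}

const approved = decide(() => true);   // selects proposal "c"
const declined = decide(() => false);  // null: nothing ships without approval
```

The cheapest candidate ("a") never reaches the human because it violates the latency constraint, so the checkpoint is spent on one vetted option rather than four raw ones.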


Use Linters to Enforce Test Contracts

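Use ESLint (or an equivalent linter) to enforce that every module ships with its required tests and observability hooks, so that a missing test file or an uninstrumented module is caught automatically rather than in review.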
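
The contract itself is simple to state in code. The sketch below is not an ESLint rule; it is a standalone check over a list of file paths that a custom rule or CI script could wrap. It assumes a hypothetical convention that every `src/**/*.ts` module has a sibling `*.test.ts` file.

```typescript
// Return every source module that has no sibling test file.
// Assumed convention (hypothetical): src/foo.ts is covered by src/foo.test.ts.
function modulesMissingTests(paths: string[]): string[] {
  const all = new Set(paths);
  return paths
    .filter((p) => p.startsWith("src/") && p.endsWith(".ts") && !p.endsWith(".test.ts"))
    .filter((p) => !all.has(p.replace(/\.ts$/, ".test.ts")))
    .sort();
}

// Example repository listing with one uncovered module.
const repoFiles = [
  "src/billing.ts",
  "src/billing.test.ts",
  "src/workflows/sync.ts",
  "README.md",
];
const missing = modulesMissingTests(repoFiles); // ["src/workflows/sync.ts"]
```

A non-empty result would fail the build. The same pattern extends to observability hooks, for example by checking that each handler module imports your tracing helper.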

