Fabricate The Telemetry Before The Traffic Exists

James Phoenix

You cannot validate a dashboard or an alert with zero traffic, so manufacture the traffic. Dashboards are a testable PromQL corpus, and synthetic telemetry is the integration test for your observability stack.

Author: James Phoenix | Date: June 2026

The Chicken-and-Egg That Kills Observability

Here is the trap that kills observability before launch. Your dashboards and alerts are written against metrics that only exist once real users hit the service. So you ship to staging, the panels are flat, the alerts are grey, and you tell yourself you will check them once there is traffic. Then there is traffic, something is on fire, and you discover the alert was wired to a metric name that never existed. The dashboard was never tested. It just looked tested because it rendered.

The fix is to stop waiting for traffic and manufacture it. Treat the observability layer as a system under test, with no real load, and prove it end to end before a single user arrives. There are three moves.

Dashboards Are a PromQL Corpus You Can Unit Test

Your dashboard JSON and your alert rules are full of PromQL expressions. Every one of them is a small program that can be syntactically wrong, reference a metric that does not exist, or use a label that was never emitted. You do not need a running Prometheus to catch most of that. You need promtool.

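Pull every expression out of dashboards/*.json and alerts/*.yaml and check it offline. Rule files go to promtool as they are; dashboard expressions have to be extracted from the panel JSON first: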

# lint every rule file: YAML structure and PromQL parse errors
promtool check rules alerts/*.yaml

# no local promtool? run it from the official image
# (the image's entrypoint is prometheus, so override it)
docker run --rm -v "$PWD:/work" -w /work \
  --entrypoint promtool \
  prom/prometheus:latest check rules alerts/*.yaml

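The docker fallback matters. The whole point is that this runs anywhere, in CI, on a fresh machine, with nothing else stood up. A malformed expression fails the check, and it fails before deploy instead of during an incident. A typo'd metric name is still valid PromQL, so check rules alone will not flag it; catching that takes promtool test rules with input series, or the round-trip in the next section. Your dashboards just became a unit-testable corpus.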
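Getting dashboard queries in front of promtool needs an extraction step. The following is a minimal sketch, assuming Grafana-style dashboard JSON where each panel carries targets[].expr and row panels nest further panels. The function names, the dashboard:lint:expr_N rule names, and publish_failures_total are hypothetical; only outbox_depth comes from the text above.

```typescript
// Sketch: collect PromQL expressions from Grafana-style dashboard JSON and
// wrap them in a throwaway rules file that `promtool check rules` can lint.
// Assumption: queries live at panels[].targets[].expr, rows nest panels.

interface Target { expr?: string }
interface Panel { title?: string; targets?: Target[]; panels?: Panel[] }
interface Dashboard { panels?: Panel[] }

// Walk panels recursively and return every non-empty expr string.
function collectExprs(panels: Panel[] = []): string[] {
  const found: string[] = [];
  for (const panel of panels) {
    for (const target of panel.targets ?? []) {
      if (typeof target.expr === "string" && target.expr.trim() !== "") {
        found.push(target.expr.trim());
      }
    }
    found.push(...collectExprs(panel.panels));
  }
  return found;
}

// Render each expression as a recording rule so promtool parses it.
// JSON.stringify produces a double-quoted scalar that is also valid YAML.
function toRulesFile(exprs: string[]): string {
  const lines = ["groups:", "  - name: dashboard_lint", "    rules:"];
  exprs.forEach((expr, i) => {
    lines.push(`      - record: dashboard:lint:expr_${i}`);
    lines.push(`        expr: ${JSON.stringify(expr)}`);
  });
  return lines.join("\n") + "\n";
}

// Hypothetical dashboard with one top-level panel and one nested row.
const dashboard: Dashboard = {
  panels: [
    { title: "Outbox", targets: [{ expr: "max(outbox_depth) by (queue)" }] },
    {
      title: "Failures row",
      panels: [{ targets: [{ expr: "sum(rate(publish_failures_total[5m]))" }, {}] }],
    },
  ],
};

const exprs = collectExprs(dashboard.panels);
const rulesYaml = toRulesFile(exprs);
console.log(rulesYaml);
```

Writing rulesYaml to a temporary file and passing it to promtool check rules would lint each panel query. Expressions that use Grafana template variables such as $__rate_interval are not valid PromQL until substituted, so a real version would need a substitution step first.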

Round-Trip Synthetic Metrics Through the Real SDK

Linting the queries proves the queries are well-formed. It does not prove the names line up with what your code actually emits. For that you have to close the loop: emit fake metrics through the real recording path, then query them back and check the names match.

The discipline is to go through your production record* helpers and the real OpenTelemetry SDK, not a mock. If you emit through a fake, you have tested the fake. So you call the same recordOutboxDepth or recordPublishFailure helper the service uses, with synthetic values, let it travel through the actual SDK and exporter, and then read it back out of the collector and assert the metric name and labels are exactly what the dashboard expects.

// emit through the SAME helper production uses, not a stub
recordOutboxDepth({ queue: "publish", depth: 512 })
recordPublishFailure({ provider: "x", reason: "rate_limited" })

await forceMetricFlush()

// then query the collector back and assert the names/labels
// the dashboard's PromQL is looking for

If the dashboard queries outbox_depth and your helper emits outbox_backlog, you find out here, with one synthetic data point, instead of at 3am with a flat panel and a real backlog. This is an integration test for the seam between your recording code and your telemetry backend, a seam that is normally only exercised by production.
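The read-back half of that loop can be sketched as below. It assumes the collector exposes the metrics in Prometheus text format on a scrape endpoint, which the setup above does not spell out, and it works on a hard-coded sample of that text so it runs without a collector. parseExposition, missingSeries, and publish_failures_total are hypothetical names; the parser is deliberately simple and ignores escaped quotes inside label values.

```typescript
// Sketch: parse Prometheus text exposition and report which expected
// series (name plus label keys) never showed up.

interface Sample { name: string; labels: Record<string, string>; value: number }
interface Expectation { name: string; labels: string[] }

// Parse sample lines such as: outbox_depth{queue="publish"} 512
// Comment lines (# HELP, # TYPE) and blank lines are skipped.
function parseExposition(text: string): Sample[] {
  const samples: Sample[] = [];
  for (const raw of text.split("\n")) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const match = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)/.exec(line);
    if (!match) continue;
    const labels: Record<string, string> = {};
    const labelRe = /([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"/g;
    let pair: RegExpExecArray | null;
    while ((pair = labelRe.exec(match[2] ?? "")) !== null) {
      labels[pair[1]] = pair[2];
    }
    samples.push({ name: match[1], labels, value: Number(match[3]) });
  }
  return samples;
}

// Return a description of every expectation that no sample satisfies.
function missingSeries(samples: Sample[], expected: Expectation[]): string[] {
  return expected
    .filter((want) =>
      !samples.some((s) => s.name === want.name && want.labels.every((l) => l in s.labels)))
    .map((want) => `${want.name}{${want.labels.join(",")}}`);
}

// What the helper actually emitted (note outbox_backlog, not outbox_depth).
const scraped = [
  "# TYPE outbox_backlog gauge",
  'outbox_backlog{queue="publish"} 512',
  'publish_failures_total{provider="x",reason="rate_limited"} 1',
].join("\n");

// What the dashboard's PromQL is looking for.
const missing = missingSeries(parseExposition(scraped), [
  { name: "outbox_depth", labels: ["queue"] },
  { name: "publish_failures_total", labels: ["provider", "reason"] },
]);
console.log(missing);
```

A non-empty result fails the test. Comparing against the scraped names rather than the names passed to the SDK matters because a Prometheus exporter can rewrite them on the way out, for example by appending _total to counters.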

A Second Machine Is Your Traffic Generator

The last gap is the live staging collector itself. Linting and the local round-trip both run on a single machine. They do not prove that a metric emitted from somewhere else, over the wire, lands in the right panel on the staging stack. So generate that traffic from a different machine.

I keep a second Mac on the same LAN. I scp a small OTLP emitter to it, run it as a container pointed at the staging OTEL collector, and have it inject synthetic metrics over the network:

# copy a synthetic OTLP emitter to the second machine on the LAN
scp -r ./otlp-emitter jamesphoenix@Jamess-Mac-Studio.local:~/otlp-emitter

# run it there, pointed at the live staging collector
ssh jamesphoenix@Jamess-Mac-Studio.local \
  'docker run --rm --network host \
     -e OTEL_EXPORTER_OTLP_ENDPOINT=http://staging-collector:4317 \
     otlp-emitter'

Now the panels light up. Real network path, real collector, real ingestion, real dashboards, and still zero users. You are watching the exact wiring a launch would exercise, days before launch, from a machine that is not the one running the service. If a label gets dropped across the wire, or the collector is filtering by a resource attribute you forgot to set, the panel stays flat and you see it now.
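The emitter itself can be very small. The sketch below is one possible shape, not the emitter used above: it assumes the collector's OTLP receiver accepts OTLP/HTTP with JSON encoding (conventionally port 4318, path /v1/metrics) rather than the gRPC port 4317 shown in the command, and that the runtime has a global fetch (Node 18 or later). buildGaugePayload, sendMetrics, the scope name, and the service.name value are all hypothetical.

```typescript
// Sketch: build an OTLP/HTTP JSON gauge payload and post it to a collector.

interface KeyValue { key: string; value: { stringValue: string } }

// Convert a plain string map into OTLP's key/value attribute list.
function toAttributes(attrs: Record<string, string>): KeyValue[] {
  return Object.keys(attrs).map((key) => ({ key, value: { stringValue: attrs[key] } }));
}

// Build one gauge data point with data-point and resource attributes.
function buildGaugePayload(
  name: string,
  value: number,
  attrs: Record<string, string>,
  resourceAttrs: Record<string, string>,
  nowMs: number = Date.now(),
) {
  return {
    resourceMetrics: [{
      resource: { attributes: toAttributes(resourceAttrs) },
      scopeMetrics: [{
        scope: { name: "synthetic-emitter" },
        metrics: [{
          name,
          gauge: {
            dataPoints: [{
              asDouble: value,
              // OTLP JSON carries 64-bit nanosecond timestamps as strings;
              // appending six zeros converts whole milliseconds to nanoseconds.
              timeUnixNano: String(Math.trunc(nowMs)) + "000000",
              attributes: toAttributes(attrs),
            }],
          },
        }],
      }],
    }],
  };
}

// POST the payload; resolves with the HTTP status code.
async function sendMetrics(endpoint: string, payload: unknown): Promise<number> {
  const res = await fetch(`${endpoint}/v1/metrics`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return res.status;
}

// Same synthetic value as the local round-trip, now with a resource attribute.
const payload = buildGaugePayload(
  "outbox_depth",
  512,
  { queue: "publish" },
  { "service.name": "synthetic-emitter" },
  1700000000000,
);
console.log(JSON.stringify(payload, null, 2));
```

Calling sendMetrics("http://staging-collector:4318", payload) from the second machine would push the point over the network. Setting the resource attributes explicitly is the useful part: it is where a collector-side filter on something like service.name either matches or silently drops the data.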

The Principle

You do not get to skip testing a system just because the system is your monitoring. Observability is code, and untested code is broken code; you just have not run it yet. The three moves stack:

Lint the queries with promtool so no malformed expression reaches deploy.
Round-trip synthetic metrics through the real recording helpers and SDK so emitted names match queried names.
Inject from a second machine into the live staging collector so the full network path and the panels are exercised end to end.

Dashboards are a testable PromQL corpus. Synthetic telemetry is the integration test for your observability stack. Manufacture the traffic, and the first real incident is the second time your monitoring has run, not the first.
