The Same Endpoint Now Has Three Callers

James Phoenix

In an AI-native product the same route is hit by a browser, an API key, and an agent. Caller identity stops being metadata and becomes a correctness axis that runs through auth, audit, authorization, and rate limiting all at once.

Author: James Phoenix | Date: July 2026


The Endpoint That Used to Have One Caller

For most of the history of web software, an endpoint had one kind of caller: a human in a browser session. You could reason about it that way and be right almost all the time. Authentication told you who they were, and everything downstream (the audit log, the authorization check, the rate limiter) quietly assumed a person on the other end who could read an error and slow down.

That assumption is now wrong. In OctoSpark the same createPost route gets hit three ways: a browser session (a human in the app), an API key (someone’s CLI or SDK script), and an MCP token (an agent calling a tool). The question that kicked this off was blunt: “how do you know if this is a user or an agent via MCP or CLI?” The uncomfortable answer was that the code did not know, because it hardcoded the caller to user everywhere and moved on.

Once the same endpoint serves humans, scripts, and agents, caller identity is no longer a label you stamp on a row. It is a cross-cutting correctness property, and it breaks four subsystems at the same time.

One endpoint, three callers: a browser session resolves to actor user, an API key to actor api_key, and an MCP token to actor agent, each derived from the one auth principal

Auth and Audit: Derive the Caller Once, Lint the Literals

The first break is the audit log. Writing actor_type: 'user' into an audit event is not a cosmetic default; it is a lie that survives forever in your append-only history. When an agent publishes a post and the log says a user did it, accountability is gone the moment you need it.

The fix is small and structural. The auth principal already carries the discriminators, so I derived the actor in exactly one place: an apiKeyId means an API-key caller (CLI or SDK), an oauthMcpResource means an agent (MCP), and the absence of both means a session user. Then I added a lint rule that forbids hardcoding a caller-identity literal anywhere except the one allowed value, system.
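The derivation above fits in a few lines of TypeScript. This is a minimal sketch, not OctoSpark's actual code: the discriminator names apiKeyId and oauthMcpResource come from the text, while the AuthPrincipal shape, the userId field, the deriveActor name, and the precedence when both discriminators are present are assumptions for illustration.

```typescript
// Hypothetical principal shape. Only apiKeyId and oauthMcpResource are named
// in the article; userId is an assumed field.
interface AuthPrincipal {
  userId: string;
  apiKeyId?: string;
  oauthMcpResource?: string;
}

type ActorType = "user" | "api_key" | "agent";

// The single place where caller class is derived from the auth principal.
// Assumption: if both discriminators were ever set, the API key wins.
function deriveActor(principal: AuthPrincipal): ActorType {
  if (principal.apiKeyId !== undefined) return "api_key";
  if (principal.oauthMcpResource !== undefined) return "agent";
  return "user"; // absence of both means a browser session
}
```

Every audit write would then take its actor from deriveActor(principal) rather than from a string typed at the call site.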

The lint is the load-bearing part. Deriving the caller in one place is worthless if the next feature hardcodes user again, so the invariant has to be enforced by a rule, not by discipline. Agents are now a first-class actor class in the audit log, alongside humans and scripts, and they cannot silently disappear back into “user.”
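The article does not show the rule itself, so the following is only a rough stand-in for the idea: a text scan that flags any hardcoded actor_type literal other than system. A real rule would more likely work on the syntax tree inside the project's linter; the regex, the LintFinding shape, and the findHardcodedActors name are all assumptions.

```typescript
interface LintFinding {
  line: number; // 1-based line number in the scanned source
  literal: string; // the hardcoded value that was found
}

// Flags actor_type: 'x' or actor_type: "x" unless the literal is 'system'.
// Values computed at runtime (for example a function call) are not literals
// and are left alone.
function findHardcodedActors(source: string): LintFinding[] {
  const findings: LintFinding[] = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    // A fresh regex per line keeps lastIndex state from leaking between lines.
    const pattern = /actor_type\s*:\s*(['"])([^'"]*)\1/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(lines[i])) !== null) {
      if (match[2] !== "system") {
        findings.push({ line: i + 1, literal: match[2] });
      }
    }
  }
  return findings;
}
```

Run in CI, a non-empty result fails the build, which is what turns "derive it in one place" from a convention into an invariant.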

Authorization: Agents Are IDOR Machines

First, the term, because the rest of this section leans on it. IDOR, an insecure direct object reference, is what happens when an endpoint looks up a record straight from a client-supplied ID and trusts that ID. Request /invoices/124 when yours was /invoices/123, and if nothing checks that the invoice actually belongs to you, the server hands it over. It is one of the most common access-control bugs on the web precisely because the happy path (take an ID, fetch the row, return it) looks completely normal until someone changes the number.

The second break is that an agent is the most dangerous ID supplier you will ever wire into that path. It is worse than a human or even a scripted API caller for two reasons: it hallucinates identifiers, and it can be prompt-injected into passing whatever ID an attacker planted in its context. Hand that to an endpoint that fetches an object and then checks ownership afterwards, and the IDOR fires by default.

The defence is to stop authorizing after the fetch. Bake the user identity into the WHERE clause so a scoped read physically cannot return another tenant’s row. A query that filters by the authenticated principal from the start has no “then compare” step to get wrong, no window where the wrong object is already loaded in memory. Fetch-then-compare trusts the ID; a scoped query distrusts it by construction.
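To make the contrast concrete, here is a small sketch that uses an in-memory array in place of a database table, with the equivalent SQL in comments. The PostRow shape, the sample rows, and both function names are hypothetical; only the pattern (ownership inside the filter rather than after the fetch) comes from the text.

```typescript
interface PostRow {
  id: string;
  userId: string;
  body: string;
}

// Stand-in for a posts table.
const posts: PostRow[] = [
  { id: "p1", userId: "alice", body: "Alice draft" },
  { id: "p2", userId: "bob", body: "Bob draft" },
];

// Fetch-then-compare:
//   SELECT * FROM posts WHERE id = $1
// followed by an ownership check in application code. The other tenant's row
// is already loaded when the check runs, and the check is easy to forget.
function fetchThenCompare(postId: string, currentUserId: string): PostRow | undefined {
  const row = posts.find((p) => p.id === postId);
  if (row !== undefined && row.userId !== currentUserId) return undefined;
  return row;
}

// Scoped read:
//   SELECT * FROM posts WHERE id = $1 AND user_id = $2
// Ownership is part of the lookup, so there is no later step to omit.
function findPostScoped(postId: string, currentUserId: string): PostRow | undefined {
  return posts.find((p) => p.id === postId && p.userId === currentUserId);
}
```

Both functions return the same results while the comparison is in place. The difference is what happens when someone deletes or forgets it: the first version starts leaking rows, and the second has nothing to forget.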

Fixing IDOR: the fetch-then-compare pattern loads the row by id then checks the owner, versus a scoped query that adds AND user_id to the WHERE clause so the wrong row can never be returned

This is also why passing a CurrentUser into a domain function is not duplicating your auth logic. Identity is not authorization. Requiring the caller identity as an argument is a declaration of requirement: it says this operation cannot be expressed without knowing who is asking, which is exactly the property you want when the asker might be a confused or hijacked agent.

Rate Limiting: Automations Cannot Click Slower

The third break is the rate limiter, and it is the one everyone gets wrong by applying a single policy uniformly. A free 429 with a Retry-After header is the correct, humane response for an interactive or human-paced caller. It protects you from abuse and cost runaway, and a person (or their browser) can simply wait and try again.

That same 429 is the wrong mechanism for a first-party automation. Recurrence jobs, scheduled content, and batch work originate inside Temporal, and an automation cannot click slower. Rejecting it does not shape load; it just converts a throughput problem into a pile of failed workflow activities that retry into the same wall.

So rate limiting has to be caller-class aware. Interactive callers get the free 429. First-party automations get shaped with queueing and concurrency caps instead of rejection. The same axis (who is calling) that decided the audit actor now decides the backpressure strategy. This is the applied edge of backpressure and of the Temporal execution mode: durable callers need to be slowed, not refused.
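One way to picture the branch is a small decision function. This is a sketch under stated assumptions: the CallerClass values, the BackpressureDecision shape, and the decideBackpressure name are invented for illustration, and the actual queueing and concurrency caps (which the article places in the Temporal-backed path) are left out.

```typescript
// Hypothetical caller classes for rate limiting. The article only
// distinguishes interactive callers from first-party automations.
type CallerClass = "interactive" | "first_party_automation";

type BackpressureDecision =
  | { kind: "allow" }
  | { kind: "reject"; status: 429; retryAfterSeconds: number }
  | { kind: "enqueue" };

// Decides what to do with a request given whether the limit is exceeded.
function decideBackpressure(
  caller: CallerClass,
  overLimit: boolean,
  retryAfterSeconds: number,
): BackpressureDecision {
  if (!overLimit) return { kind: "allow" };
  // Automations are shaped, not refused: hand the work to a queue.
  if (caller === "first_party_automation") return { kind: "enqueue" };
  // Interactive callers can wait and retry, so a 429 with Retry-After is fine.
  return { kind: "reject", status: 429, retryAfterSeconds };
}
```

The point of returning a tagged decision instead of throwing is that the caller of this function, not the limiter, owns the mechanics of rejection or queueing.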

The Fourth Break: Validation Lives at a Boundary Agents Skip

There is a quieter fourth failure. To keep domains decoupled, the agent’s tool callbacks call the service facade directly through Effect rather than looping back through HTTP. That is architecturally reasonable, but it silently skips a boundary that was doing real work. In our case numeric bounds validation lived only in the contract schemas at the HTTP decode edge. The domain “verification” layer only gated capabilities on booleans and never saw the numeric values. So the agent path had no bounds checking at all, because it never crossed the edge where bounds checking lived.

The fix has two halves. Re-home the contract: pull the validation constants out of the HTTP layer so the in-process caller enforces the same bounds the HTTP route did. And make the human UI headless so the agent surface and the human composer both render the same component pointed at the same facade and the same validation constants. One source of truth, no drift, no second implementation that forgets a check. When you delete a transport, you have to relocate everything that transport was silently enforcing.
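The first half, re-homing the contract, might look like the sketch below. It uses plain TypeScript rather than the Effect schemas the article mentions, and the bounds, field names, and function names are all hypothetical; the article does not give the real constants.

```typescript
// Hypothetical bounds, owned by the domain rather than the HTTP layer.
const POST_BOUNDS = { maxBodyLength: 280, maxAttachments: 4 } as const;

interface CreatePostInput {
  body: string;
  attachmentCount: number;
}

type CreatePostResult = { ok: true } | { ok: false; errors: string[] };

// Shared validator: returns every violated bound, or an empty list.
function validateCreatePost(input: CreatePostInput): string[] {
  const errors: string[] = [];
  if (input.body.length > POST_BOUNDS.maxBodyLength) {
    errors.push("body exceeds maxBodyLength");
  }
  const count = input.attachmentCount;
  if (!Number.isInteger(count) || count < 0 || count > POST_BOUNDS.maxAttachments) {
    errors.push("attachmentCount out of bounds");
  }
  return errors;
}

// The facade enforces the bounds itself, so the check no longer depends on
// which transport delivered the request.
function createPostFacade(input: CreatePostInput): CreatePostResult {
  const errors = validateCreatePost(input);
  if (errors.length > 0) return { ok: false, errors };
  return { ok: true }; // persistence omitted in this sketch
}

// Both entry points end up in the same facade.
const handleHttpCreatePost = (input: CreatePostInput) => createPostFacade(input);
const agentToolCreatePost = (input: CreatePostInput) => createPostFacade(input);
```

The HTTP decode edge can still reject early using the same POST_BOUNDS, but it is no longer the only place the bounds exist.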

Caller Class Is the New Axis

Pull these together and the shape is clear. The moment an agent becomes a real caller, one property (which class of caller is this?) fans out into:

| Subsystem     | Human session | API key (CLI/SDK) | Agent (MCP)                     |
|---------------|---------------|-------------------|---------------------------------|
| Audit actor   | user          | api_key           | agent                           |
| Authorization | scoped query  | scoped query      | scoped query, higher IDOR risk  |
| Rate limiting | free 429      | free 429          | queue if first-party automation |
| Validation    | HTTP decode   | HTTP decode       | must re-home in-process         |

The discipline is to derive caller class from the auth principal in exactly one place, lint against every hardcoded identity literal, and then let each subsystem branch on that one derived value. A related move worth pairing with this: only stamp a trace as agent-originated on a positive, corroborated signal (a first-party CLI, the product /agent call site, or an MCP marker backed by the token principal), because a raw API request is genuinely ambiguous about whether a human or an agent made it.
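The corroboration rule for traces can be written as a single predicate. This is a sketch only: the TraceSignals fields and the isAgentOriginated name are assumptions, and how each signal is actually detected is outside what the article describes.

```typescript
// Hypothetical signals gathered for one request.
interface TraceSignals {
  firstPartyCli: boolean; // request came from the first-party CLI
  agentCallSite: boolean; // request came from the product /agent call site
  mcpMarker: boolean; // request carries an MCP marker
  tokenPrincipalIsMcp: boolean; // the token principal itself is an MCP principal
}

// Stamp a trace as agent-originated only on a positive, corroborated signal.
// An MCP marker alone is not enough; it has to be backed by the token principal.
function isAgentOriginated(signals: TraceSignals): boolean {
  return (
    signals.firstPartyCli ||
    signals.agentCallSite ||
    (signals.mcpMarker && signals.tokenPrincipalIsMcp)
  );
}
```

A raw API request with none of these signals stays unstamped, which keeps the ambiguity visible instead of guessing.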

This sits next to Tool Access Control, Prompt Injection Prevention, and Closed-Loop Agent Observability, but none of those treat the agent as a first-class caller whose arrival changes auth, audit, authorization, and rate limiting simultaneously.

Treat the agent as a first-class caller, or it will treat your endpoints as a first-class attack surface.

