In an AI-native product the same route is hit by a browser, an API key, and an agent. Caller identity stops being metadata and becomes a correctness axis that runs through auth, audit, authorization, and rate limiting all at once.
Author: James Phoenix | Date: July 2026
The Endpoint That Used to Have One Caller
For most of the history of web software, an endpoint had one kind of caller: a human in a browser session. You could reason about it that way and be right almost all the time. Authentication told you who they were, and everything downstream (the audit log, the authorization check, the rate limiter) quietly assumed a person on the other end who could read an error and slow down.
That assumption is now wrong. In OctoSpark the same createPost route gets hit three ways: a browser session (a human in the app), an API key (someone’s CLI or SDK script), and an MCP token (an agent calling a tool). The question that kicked this off was blunt: “how do you know if this is a user or an agent via MCP or CLI?” The uncomfortable answer was that the code did not know, because it hardcoded the caller to user everywhere and moved on.
Once the same endpoint serves humans, scripts, and agents, caller identity is no longer a label you stamp on a row. It is a cross-cutting correctness property, and it breaks four subsystems at the same time.

Auth and Audit: Derive the Caller Once, Lint the Literals
The first break is the audit log. Writing actor_type: 'user' into an audit event is not a cosmetic default; it is a lie that survives forever in your append-only history. When an agent publishes a post and the log says a user did it, accountability is gone the moment you need it.
The fix is small and structural. The auth principal already carries the discriminators, so I derived the actor in exactly one place: an apiKeyId means an API-key caller (CLI or SDK), an oauthMcpResource means an agent (MCP), and the absence of both means a session user. Then I added a lint rule that forbids hardcoding a caller-identity literal anywhere except the one allowed value, system.
The lint is the load-bearing part. Deriving the caller in one place is worthless if the next feature hardcodes user again, so the invariant has to be enforced by a rule, not by discipline. Agents are now a first-class actor class in the audit log, alongside humans and scripts, and they cannot silently disappear back into “user.”
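A minimal sketch of that single derivation point, assuming a principal shaped roughly like the one described above. Only apiKeyId and oauthMcpResource come from the text; the AuthPrincipal interface, the userId field, the function names, and the precedence when both discriminators are set are illustrative assumptions, not the actual OctoSpark code.

```typescript
// Hypothetical principal shape. Only apiKeyId and oauthMcpResource are
// named in the article; everything else here is an assumption.
interface AuthPrincipal {
  userId: string;
  apiKeyId?: string;
  oauthMcpResource?: string;
}

type ActorType = "user" | "api_key" | "agent" | "system";

interface AuditEvent {
  actor_type: ActorType;
  actor_id: string;
  action: string;
}

// The one place where caller class is derived. Nothing else in the
// codebase should spell out "user", "api_key", or "agent" as a literal.
function deriveActorType(principal: AuthPrincipal): ActorType {
  // Assumption: an API key takes precedence if both fields are present.
  if (principal.apiKeyId !== undefined) return "api_key";
  if (principal.oauthMcpResource !== undefined) return "agent";
  return "user";
}

// Audit events take the principal, never a caller-supplied actor string.
function buildAuditEvent(principal: AuthPrincipal, action: string): AuditEvent {
  return {
    actor_type: deriveActorType(principal),
    actor_id: principal.userId,
    action,
  };
}
```

Because buildAuditEvent accepts the principal rather than an actor string, a call site has no parameter through which to hardcode the caller; the lint rule then covers any code that bypasses the helper.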
Authorization: Agents Are IDOR Machines
First, the term, because the rest of this section leans on it. IDOR, an insecure direct object reference, is what happens when an endpoint looks up a record straight from a client-supplied ID and trusts that ID. Request /invoices/124 when yours was /invoices/123, and if nothing checks that the invoice actually belongs to you, the server hands it over. It is one of the most common access-control bugs on the web precisely because the happy path (take an ID, fetch the row, return it) looks completely normal until someone changes the number.
The second break is that an agent is the most dangerous ID supplier you will ever wire into that path. It is worse than a human or even a scripted API caller for two reasons: it hallucinates identifiers, and it can be prompt-injected into passing whatever ID an attacker planted in its context. Hand that to an endpoint that fetches an object and then checks ownership afterwards, and the IDOR fires by default.
The defence is to stop authorizing after the fetch. Bake the user identity into the WHERE clause so a scoped read physically cannot return another tenant’s row. A query that filters by the authenticated principal from the start has no “then compare” step to get wrong, no window where the wrong object is already loaded in memory. Fetch-then-compare trusts the ID; a scoped query distrusts it by construction.
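The difference can be sketched with an in-memory array standing in for a table. The Post and CurrentUser shapes, the sample rows, and both function names are hypothetical; the SQL in the comments is only there to show where the principal enters the query.

```typescript
interface CurrentUser {
  userId: string;
}

interface Post {
  id: string;
  ownerId: string;
  body: string;
}

// Stand-in for a posts table (illustrative data).
const posts: Post[] = [
  { id: "p1", ownerId: "alice", body: "alice's draft" },
  { id: "p2", ownerId: "bob", body: "bob's draft" },
];

// Fetch-then-compare: SELECT * FROM posts WHERE id = $1
// The row is loaded before anyone asks who owns it, so every caller
// must remember to add the ownership check afterwards.
function findPostUnscoped(postId: string): Post | undefined {
  return posts.find((p) => p.id === postId);
}

// Scoped read: SELECT * FROM posts WHERE id = $1 AND owner_id = $2
// The principal is part of the lookup, so a hallucinated or injected ID
// for someone else's post simply matches no row.
function findPostScoped(currentUser: CurrentUser, postId: string): Post | undefined {
  return posts.find((p) => p.id === postId && p.ownerId === currentUser.userId);
}
```

With the scoped version, a request for p2 made on behalf of alice is indistinguishable from a request for a post that does not exist, which is the behaviour you want when the ID came from an agent's context.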

This is also why passing a CurrentUser into a domain function is not duplicating your auth logic. Identity is not authorization. Requiring the caller identity as an argument is a declaration of requirement: it says this operation cannot be expressed without knowing who is asking, which is exactly the property you want when the asker might be a confused or hijacked agent.
Rate Limiting: Automations Cannot Click Slower
The third break is the rate limiter, and it is the one everyone gets wrong by applying a single policy uniformly. A free 429 with a Retry-After header is the correct, humane response for an interactive or human-paced caller. It protects you from abuse and cost runaway, and a person (or their browser) can simply wait and try again.
That same 429 is the wrong mechanism for a first-party automation. Recurrence jobs, scheduled content, and batch work originate inside Temporal, and an automation cannot click slower. Rejecting it does not shape load; it just converts a throughput problem into a pile of failed workflow activities that retry into the same wall.
So rate limiting has to be caller-class aware. Interactive callers get the free 429. First-party automations get shaped with queueing and concurrency caps instead of rejection. The same axis (who is calling) that decided the audit actor now decides the backpressure strategy. This is the applied edge of backpressure and of the Temporal execution mode: durable callers need to be slowed, not refused.
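A rough sketch of what branching on caller class could look like, under heavy simplification. The CallerAwareLimiter class, the two caller-class names, and the single concurrency cap are assumptions for illustration; a real deployment would more likely lean on the workflow engine's own queueing and concurrency controls than on an in-process queue.

```typescript
type CallerClass = "interactive" | "first_party_automation";

type Decision =
  | { kind: "allow" }
  | { kind: "reject"; status: 429; retryAfterSeconds: number }
  | { kind: "enqueue"; position: number };

// Hypothetical limiter: one concurrency cap, two overflow strategies.
class CallerAwareLimiter {
  private inFlight = 0;
  private readonly queue: Array<() => void> = [];
  private readonly maxConcurrent: number;
  private readonly retryAfterSeconds: number;

  constructor(maxConcurrent: number, retryAfterSeconds: number) {
    this.maxConcurrent = maxConcurrent;
    this.retryAfterSeconds = retryAfterSeconds;
  }

  // Runs the job immediately if there is capacity. Otherwise the caller
  // class picks the strategy: refuse interactive callers, queue automations.
  admit(callerClass: CallerClass, job: () => void): Decision {
    if (this.inFlight < this.maxConcurrent) {
      this.inFlight += 1;
      job();
      return { kind: "allow" };
    }
    if (callerClass === "interactive") {
      return { kind: "reject", status: 429, retryAfterSeconds: this.retryAfterSeconds };
    }
    this.queue.push(job);
    return { kind: "enqueue", position: this.queue.length };
  }

  // Called when a running job finishes; starts the next queued job, if any.
  release(): void {
    if (this.inFlight > 0) this.inFlight -= 1;
    const next = this.queue.shift();
    if (next !== undefined) {
      this.inFlight += 1;
      next();
    }
  }
}
```

The point of the sketch is the shape of the decision, not the queue itself: the over-capacity branch is the only place where the two caller classes diverge.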
The Fourth Break: Validation Lives at a Boundary Agents Skip
There is a quieter fourth failure. To keep domains decoupled, the agent’s tool callbacks call the service facade directly through Effect rather than looping back through HTTP. That is architecturally reasonable, but it silently skips a boundary that was doing real work. In our case numeric bounds validation lived only in the contract schemas at the HTTP decode edge. The domain “verification” layer only gated capabilities on booleans and never saw the numeric values. So the agent path had no bounds checking at all, because it never crossed the edge where bounds checking lived.
The fix has two halves. Re-home the contract: pull the validation constants out of the HTTP layer so the in-process caller enforces the same bounds the HTTP route did. And make the human UI headless so the agent surface and the human composer both render the same component pointed at the same facade and the same validation constants. One source of truth, no drift, no second implementation that forgets a check. When you delete a transport, you have to relocate everything that transport was silently enforcing.
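The first half can be sketched as follows: the bounds live beside the facade, and both entry points go through it. The constant names, the limits, the CreatePostInput fields, and the handler names are all made up for illustration, and the sketch uses plain functions rather than Effect or the real contract schemas.

```typescript
// Hypothetical shared bounds, owned by the domain rather than the HTTP layer.
const POST_BOUNDS = {
  maxBodyLength: 500,
  maxRecurrenceCount: 52,
} as const;

interface CreatePostInput {
  body: string;
  recurrenceCount: number;
}

type CreatePostResult =
  | { ok: true; post: CreatePostInput }
  | { ok: false; errors: string[] };

// Single bounds check used by every caller of the facade.
function checkPostBounds(input: CreatePostInput): string[] {
  const errors: string[] = [];
  if (input.body.length === 0 || input.body.length > POST_BOUNDS.maxBodyLength) {
    errors.push(`body length must be 1..${POST_BOUNDS.maxBodyLength}`);
  }
  const count = input.recurrenceCount;
  if (!Number.isInteger(count) || count < 0 || count > POST_BOUNDS.maxRecurrenceCount) {
    errors.push(`recurrenceCount must be an integer 0..${POST_BOUNDS.maxRecurrenceCount}`);
  }
  return errors;
}

// Service facade: validation happens here, not only at the HTTP edge.
function createPost(input: CreatePostInput): CreatePostResult {
  const errors = checkPostBounds(input);
  return errors.length > 0 ? { ok: false, errors } : { ok: true, post: input };
}

// HTTP route: maps the facade result to a status code.
function httpCreatePostHandler(body: CreatePostInput): { status: number } {
  return { status: createPost(body).ok ? 201 : 400 };
}

// Agent tool callback: calls the facade in-process, same bounds apply.
function agentCreatePostTool(args: CreatePostInput): string {
  const result = createPost(args);
  return result.ok ? "created" : `rejected: ${result.errors.join("; ")}`;
}
```

An out-of-range recurrenceCount is now refused on both paths, because neither path can reach the domain without passing checkPostBounds.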
Caller Class Is the New Axis
Pull these together and the shape is clear. The moment an agent becomes a real caller, one property (which class of caller is this?) fans out into:
| Subsystem | Human session | API key (CLI/SDK) | Agent (MCP) |
|---|---|---|---|
| Audit actor | user | api_key | agent |
| Authorization | scoped query | scoped query | scoped query, higher IDOR risk |
| Rate limiting | free 429 | free 429 | queue if first-party automation |
| Validation | HTTP decode | HTTP decode | must re-home in-process |
The discipline is to derive caller class from the auth principal in exactly one place, lint against every hardcoded identity literal, and then let each subsystem branch on that one derived value. A related move worth pairing with this: only stamp a trace as agent-originated on a positive, corroborated signal (a first-party CLI, the product /agent call site, or an MCP marker backed by the token principal), because a raw API request is genuinely ambiguous about whether a human or an agent made it.
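The trace-stamping rule can be written down as a small predicate. The OriginSignals field names and the "unknown" fallback label are assumptions; the three accepted signals are the ones listed above.

```typescript
// Hypothetical signals collected for a request before its trace is stamped.
interface OriginSignals {
  firstPartyCli: boolean;           // request came from the first-party CLI
  agentCallSite: boolean;           // request came from the product /agent call site
  mcpMarker: boolean;               // request carries an MCP marker
  principalHasMcpResource: boolean; // token principal confirms an MCP resource
}

type TraceOrigin = "agent" | "unknown";

// Stamp "agent" only on a positive, corroborated signal. An MCP marker
// alone is not enough; it must be backed by the token principal.
function classifyTraceOrigin(signals: OriginSignals): TraceOrigin {
  if (signals.firstPartyCli) return "agent";
  if (signals.agentCallSite) return "agent";
  if (signals.mcpMarker && signals.principalHasMcpResource) return "agent";
  return "unknown";
}
```

A raw API request with none of these signals stays "unknown" rather than being guessed into either class.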
This sits next to Tool Access Control, Prompt Injection Prevention, and Closed-Loop Agent Observability, but none of those treat the agent as a first-class caller whose arrival changes auth, audit, authorization, and rate limiting simultaneously.
Treat the agent as a first-class caller, or it will treat your endpoints as a first-class attack surface.

