The single most important thing to understand about a model API is that it is stateless. The provider keeps no memory of your last request. Each call is processed in isolation, as if the model had never spoken to you before. There is no server-side conversation quietly accumulating on their end.
Every request starts from zero
This runs against the intuition a chat interface gives you. It feels like the model remembers the conversation, but it does not. What actually happens is that the whole history gets re-sent as context on every request. The "memory" you experience is the client resubmitting everything each time, not the model recalling anything.
Why this shapes everything downstream
Once statelessness clicks, a lot of agent design stops being mysterious:
- Context is mandatory, not optional. If a fact is not in this request, the model has no way to know it. Nothing carries over on its own.
- History has a running cost. Because a session re-sends its whole transcript every time, long conversations get steadily more expensive and slower.
- Memory has to be built. Anything that feels like the model "remembering" across sessions is machinery someone wrote to store and reload context. The model itself contributes nothing durable.
Related terms
Stateful
Stateful describes anything that keeps state across requests: conversation history, memory, a session. In an agent that job belongs to the harness or app, never to the stateless model API.
Read definition →Context
Context is all the text a model can see for a single request: the system prompt, your message, the conversation so far, and any files or tool output the agent has pulled in. It is the only thing the model knows about your specific situation.
Read definition →Session
A session is one continuous conversation with an agent that accumulates history in the context window. Resetting or ending it clears that history and starts the agent from a blank slate.
Read definition →