A model on its own is a text-prediction engine with no hands. The harness is everything wrapped around it that turns predictions into an agent that can actually get work done. Think of it as the agent minus the model: the plumbing, the loop, and the guardrails.
What the harness handles
On every step, the harness is doing the unglamorous but essential work:
- Assembling the request. It gathers the system prompt, the history, the loaded files, and the tool definitions into a single request to send off.
- Running tools. When the model asks to read a file or run a command, the harness is what executes it and feeds the result back.
- Managing context. It decides what stays in the window, what gets trimmed, and what gets compacted as a session grows.
- Enforcing permissions. It is the layer that pauses to ask you before doing something risky, because the model only requests actions, it never runs them.
Why the distinction is worth keeping
People say "the AI wrote this code," but most of the behaviour you experience is the harness, not the model. Two tools using the identical model can feel completely different because their harnesses differ: how they curate context, which tools they expose, how aggressively they act. When an agent impresses or frustrates you, it is usually the harness at work.
Related terms
Agent
An agent is a language model wrapped in a loop that lets it call tools, read the results, and decide what to do next. The model supplies the judgement; the loop and the tools give it hands.
Read definition →Tool
A tool is a named action, with a typed input schema, that a model is allowed to call. Tools are how a model that can only produce text gets to actually do things: read a file, run a command, search the web.
Read definition →Model provider request
A model provider request is a single API call to the provider carrying the messages, tools, and settings for one step. It is the atomic unit of agent work, and one turn can be many requests.
Read definition →