Harness

A model on its own is a text-prediction engine with no hands. The harness is everything wrapped around it that turns predictions into an agent that can actually get work done. Think of it as the agent minus the model: the plumbing, the loop, and the guardrails.

What the harness handles

On every step, the harness is doing the unglamorous but essential work:

Assembling the request. It gathers the system prompt, the history, the loaded files, and the tool definitions into a single request to send off.
Running tools. When the model asks to read a file or run a command, the harness is what executes it and feeds the result back.
Managing context. It decides what stays in the window, what gets trimmed, and what gets compacted as a session grows.
Enforcing permissions. It is the layer that pauses to ask you before doing something risky, because the model only requests actions, it never runs them.

Why the distinction is worth keeping

People say "the AI wrote this code," but most of the behaviour you experience is the harness, not the model. Two tools using the identical model can feel completely different because their harnesses differ: how they curate context, which tools they expose, how aggressively they act. When an agent impresses or frustrates you, it is usually the harness at work.

Note

The same model can power a simple chat box and a full autonomous coding agent. The gap between those two experiences is almost entirely harness.

What the harness handles

Why the distinction is worth keeping

Related terms

Agent

Tool

Model provider request

Building with AI agents?