Providers & requests

Harness

Also called: agent harness, scaffold

The harness is the code wrapped around a model that builds requests, runs tools, manages context, and enforces permissions. It is the agent minus the model, and it is where most of the real engineering lives.

James Phoenix
Understanding Data Updated July 2, 2026

A model on its own is a text-prediction engine with no hands. The harness is everything wrapped around it that turns predictions into an agent that can actually get work done. Think of it as the agent minus the model: the plumbing, the loop, and the guardrails.

What the harness handles

On every step, the harness is doing the unglamorous but essential work:

  • Assembling the request. It gathers the system prompt, the history, the loaded files, and the tool definitions into a single request to send off.
  • Running tools. When the model asks to read a file or run a command, the harness is what executes it and feeds the result back.
  • Managing context. It decides what stays in the window, what gets trimmed, and what gets compacted as a session grows.
  • Enforcing permissions. It is the layer that pauses to ask you before doing something risky, because the model only requests actions, it never runs them.

Why the distinction is worth keeping

People say "the AI wrote this code," but most of the behaviour you experience is the harness, not the model. Two tools using the identical model can feel completely different because their harnesses differ: how they curate context, which tools they expose, how aggressively they act. When an agent impresses or frustrates you, it is usually the harness at work.

Note
The same model can power a simple chat box and a full autonomous coding agent. The gap between those two experiences is almost entirely harness.

Related terms

Building with AI agents?

This dictionary is part of how I think about agentic engineering. If you want the same thinking applied to your codebase, that is what I do.

See how I can help