“Build a Harness” Is the New “Reverse a Linked List”


The interview question used to be “reverse a linked list.” Now it’s “build me a small agent harness.”

Author: James Phoenix | Date: March 2026

Source: @DionysianAgent on X


The Old Litmus Test

For two decades, the canonical interview question was “reverse a linked list.” Not because anyone reverses linked lists in production. Because it proved you understood pointers, memory, and how data structures actually work under the hood. The question was a proxy for foundational knowledge.

“What’s a primary key?” served the same purpose for backend engineers. You were not testing SQL syntax. You were testing whether someone understood relational data, uniqueness constraints, and why databases work the way they do.

These questions filtered for people who understood the machine they were operating.


The Machine Changed

The machine is no longer a compiler and a runtime. The machine is an LLM inside a loop, calling tools, accumulating context, and deciding what to do next. That is what Claude Code is. That is what Codex is. That is what every coding agent is.

If you use these tools daily but cannot explain what happens between your prompt and the code that appears on screen, you are in the same position as someone who writes Python but cannot explain what a pointer is. You can be productive. You cannot debug, extend, or reason about failure modes.

The new litmus test is: can you build a minimal agent harness from scratch?

Not a production system. Not a framework. A 200-line script that proves you understand the core loop.

What the Core Loop Actually Is

Every coding agent, from Claude Code to Cursor to Codex, runs the same fundamental loop. Strip away the UI, the streaming, the permission system, and the MCP integrations, and you get this:

1. Build a message array (system prompt + conversation history)
2. Call the LLM API with tools defined
3. The LLM responds with either text or tool calls
4. If tool calls: execute them, append the results to the messages, and go to step 2
5. If text only: return the response. Done.

That is the entire architecture. Everything else is engineering on top of this loop. Here is what it looks like in code:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools = [
  {
    name: "read_file",
    description: "Read a file from disk",
    input_schema: {
      type: "object",
      properties: { path: { type: "string" } },
      required: ["path"],
    },
  },
  {
    name: "write_file",
    description: "Write content to a file",
    input_schema: {
      type: "object",
      properties: {
        path: { type: "string" },
        content: { type: "string" },
      },
      required: ["path", "content"],
    },
  },
  {
    name: "run_command",
    description: "Run a shell command",
    input_schema: {
      type: "object",
      properties: { command: { type: "string" } },
      required: ["command"],
    },
  },
];

async function executeTool(name: string, input: any): Promise<string> {
  switch (name) {
    case "read_file":
      return await Bun.file(input.path).text();
    case "write_file":
      await Bun.write(input.path, input.content);
      return "File written successfully.";
    case "run_command": {
      // Braces create a block scope so the `const` declaration is legal here.
      const proc = Bun.spawn(["sh", "-c", input.command], { stderr: "pipe" });
      const [stdout, stderr] = await Promise.all([
        new Response(proc.stdout).text(),
        new Response(proc.stderr).text(),
      ]);
      // Feed stderr back too, so the model can see why a command failed.
      return stdout + (stderr ? `\n[stderr]\n${stderr}` : "");
    }
    default:
      return `Unknown tool: ${name}`;
  }
}

async function agentLoop(userMessage: string) {
  const messages: any[] = [{ role: "user", content: userMessage }];

  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      system: "You are a coding agent. Use tools to complete tasks.",
      tools,
      messages,
    });

    // Append the assistant's response
    messages.push({ role: "assistant", content: response.content });

    // If the model stopped because it wants to use tools
    if (response.stop_reason === "tool_use") {
      const toolResults = [];
      for (const block of response.content) {
        if (block.type === "tool_use") {
          console.log(`> ${block.name}(${JSON.stringify(block.input)})`);
          const result = await executeTool(block.name, block.input);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id,
            content: result,
          });
        }
      }
      messages.push({ role: "user", content: toolResults });
    } else {
      // Model is done, print final response
      for (const block of response.content) {
        if (block.type === "text") console.log(block.text);
      }
      break;
    }
  }
}

agentLoop(process.argv[2] || "What files are in the current directory?");

That is ~80 lines. It reads files, writes files, and runs commands. It loops until the LLM decides it is done. This is a coding agent. Everything Claude Code adds on top (permissions, hooks, sub-agents, streaming, MCP, context compaction) is harness engineering around this exact loop.

Why Building It Teaches You Everything

@DionysianAgent made this point well: through building your own harness, you naturally encounter every problem that matters in AI engineering. Not because you set out to learn them, but because the loop forces you to confront them.

You build the loop above and it works for simple tasks. Then you try something harder, and you hit:

  • Context limits. Your messages array grows until the API rejects it. Now you need to learn about token budgets, context compaction, and sliding window strategies.
  • Tool failures. A shell command errors out. Now you need error handling, retry logic, and the decision of when to feed errors back to the LLM vs. when to bail out.
  • Hallucinated tool calls. The model invents a tool that does not exist. Now you need input validation and structured output constraints.
  • Runaway loops. The agent gets stuck retrying the same broken approach. Now you need max-iteration guards and escalation logic.
  • Memory across sessions. You close the terminal and lose everything. Now you need persistence, progress files, and session resumption.
  • Permissions. The agent runs rm -rf /. Now you need sandboxing, allowlists, and human-in-the-loop approval gates.
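The first few of these fixes are small enough to sketch. Here is one hedged version of the guards you typically bolt on once the naive loop breaks: a runaway-loop cap, a crude context-budget clip on tool output, and error feedback instead of crashes. The names and limits (MAX_ITERATIONS, truncateToolResult, 10,000 chars) are illustrative choices of mine, not from any SDK:

```typescript
// Illustrative guards for the naive agent loop. All names and limits here
// are assumptions for the sketch, not part of the Anthropic SDK.

const MAX_ITERATIONS = 25; // runaway-loop guard: bail instead of looping forever
const MAX_RESULT_CHARS = 10_000; // crude context-budget guard per tool result

// Clip oversized tool output before it enters the message array,
// keeping the head and tail, where errors usually live.
function truncateToolResult(result: string): string {
  if (result.length <= MAX_RESULT_CHARS) return result;
  const half = MAX_RESULT_CHARS / 2;
  return (
    result.slice(0, half) +
    `\n...[${result.length - MAX_RESULT_CHARS} chars truncated]...\n` +
    result.slice(-half)
  );
}

// Wrap tool execution so a failure becomes text the model can react to,
// instead of an exception that kills the loop.
async function safeExecuteTool(
  name: string,
  input: unknown,
  execute: (name: string, input: unknown) => Promise<string>,
): Promise<string> {
  try {
    return truncateToolResult(await execute(name, input));
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return `Tool ${name} failed: ${message}`;
  }
}
```

In the loop itself, `while (true)` becomes `for (let i = 0; i < MAX_ITERATIONS; i++)`, and `executeTool` calls go through `safeExecuteTool`. None of this is clever; the point is that you only discover these guards are necessary by watching the bare loop fail.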

Each problem you hit maps directly to a production concern. You are not studying these concepts in the abstract. You are solving them because your harness broke and you need it to work.

What This Proves About a Candidate

When someone can build this minimal harness, they demonstrate understanding of:

  1. The LLM API contract. Messages, roles, tool definitions, stop reasons. The actual interface, not an abstraction over it.
  2. The agentic loop. Why it loops, when it terminates, how context accumulates.
  3. Tool execution. The boundary between what the LLM decides and what your code does. This is the most important architectural boundary in agent systems.
  4. State management. The messages array is the state. Everything derives from it. This is Factor 12 of the 12 Factor Agents: the agent as a stateless reducer over an event log.
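The reducer framing is worth making concrete. Here is a toy sketch of the idea, with types and names of my own invention rather than anything from the 12 Factor Agents document: the message array is an event log, appending is the only state transition, and replaying the log reconstructs the state:

```typescript
// Toy illustration of "agent as stateless reducer over an event log".
// The Message type and reduce function are invented for this sketch.
type Message = { role: "user" | "assistant"; content: unknown };

// The only state transition: append an event to the log. Pure function,
// returns a new log, never mutates the old one.
function reduce(log: Message[], event: Message): Message[] {
  return [...log, event];
}

// Replaying the same events always rebuilds the same state, which is
// exactly what makes a session resumable from a persisted log.
const events: Message[] = [
  { role: "user", content: "Fix the failing test" },
  { role: "assistant", content: [{ type: "tool_use", name: "read_file" }] },
];
const log = events.reduce(reduce, [] as Message[]);
```

This is why session resumption is not a special feature: persist the events, replay them through the reducer, and the agent is exactly where it left off.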

Someone who cannot build this cannot reason about why their Claude Code session went sideways. They cannot debug a stuck agent. They cannot write effective CLAUDE.md files because they do not understand what consumes them.

The Linked List Parallel

Reversing a linked list tested whether you understood how the machine you were programming actually worked. Not the syntax. The machine.

Building a harness tests the same thing for the new machine. Not whether you can prompt an LLM. Whether you understand the loop, the tools, the context window, and the control flow that turns a language model into a software engineering agent.

The engineers who will thrive in the next decade are not the ones who use the best coding agent. They are the ones who understand what is happening inside it well enough to build, debug, and extend the harness around it.

You do not need to build your own production agent framework. But you should be able to build the 200-line version. That is the new minimum. That is the new linked list.

