Context-Efficient Backpressure

James Phoenix

Swallow all test/build/lint output and replace it with a single ✓ line if the stage passes. If exitCode != 0, dump the stashed output.


The Problem

Standard test runs generate 200+ lines of output

When tests pass, those 200+ lines cost roughly 4-6k tokens (at ~20-30 tokens per line), about 2-3% of a 200k-token context window, to convey a result that needs fewer than 10 tokens: everything passed.
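
You can measure the overhead directly (sketch; the command is illustrative):

# Count the lines an agent would otherwise ingest from one full test run
npm test 2>&1 | wc -l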


Stay in the Smart Zone

Claude models perform best when the working context stays under roughly 75k tokens. Beyond that, you enter the “dumb zone” where:

  • Agents miss obvious errors
  • Instructions get ignored
  • Human intervention costs 10x more than token savings



The Fix: run_silent()

Wrapper intercepts and conditionally displays output
run_silent() {
    local description="$1"
    local command="$2"
    local tmp_file=$(mktemp)

    # Stash combined stdout+stderr; on success it is never shown
    if eval "$command" > "$tmp_file" 2>&1; then
        printf "  ✓ %s\n" "$description"
        rm -f "$tmp_file"
        return 0
    else
        # On failure, replay the stashed output in full
        local exit_code=$?
        printf "  ✗ %s\n" "$description"
        cat "$tmp_file"
        rm -f "$tmp_file"
        return $exit_code
    fi
}

# Usage
run_silent "Auth tests" "pytest tests/auth/"
run_silent "Utils tests" "pytest tests/utils/"
run_silent "API tests" "pytest tests/api/"

Output on success:

✓ Auth tests
✓ Utils tests
✓ API tests

Output on failure:

 ✓ Auth tests
 ✓ Utils tests
 ✗ API tests

FAIL src/api/users.test.ts
 should validate email format
  Expected: true
  Received: false

Token comparison: verbose vs compressed output

Progressive Refinement

Enable Fail-Fast

Process one failure at a time instead of dumping all errors:

# Python
pytest -x tests/

# JavaScript
jest --bail

# Go
go test -failfast ./...

# Vitest
vitest run --bail 1
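
These flags compose with run_silent(). A sketch, reusing the wrapper above with the same illustrative test paths:

# The first failing test halts the suite, so the dumped log stays small
run_silent "Unit tests" "pytest -x tests/"
run_silent "JS tests" "jest --bail"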

Filter Irrelevant Output

Strip noise using standard Unix tools:

run_silent_filtered() {
    local description="$1"
    local command="$2"
    local filter="${3:-cat}"  # Default: no filter
    local tmp_file=$(mktemp)

    # Capture raw output first: piping the command through the filter
    # would make `if` test the filter's exit code, not the command's
    if eval "$command" > "$tmp_file" 2>&1; then
        printf "  ✓ %s\n" "$description"
        rm -f "$tmp_file"
        return 0
    else
        local exit_code=$?
        printf "  ✗ %s\n" "$description"
        # Apply the filter only when displaying the failure
        eval "$filter" < "$tmp_file"
        rm -f "$tmp_file"
        return $exit_code
    fi
}

# Filter out timing info and stack frames from node_modules
run_silent_filtered "API tests" \
    "jest tests/api" \
    "grep -v 'node_modules' | grep -v 'Time:'"

Framework-Specific Parsing

Extract only relevant info from test output:

# Pytest: show only test counts on success
pytest_summary() {
    pytest "$@" 2>&1 | tail -1 | grep -E "passed|failed|error"
}

# Jest: extract failure details
jest_failures() {
    jest "$@" 2>&1 | grep -A 5 "FAIL\|●"
}

# Go: show only failures
go_test_failures() {
    go test "$@" 2>&1 | grep -E "FAIL|---"
}
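
For example, pytest_summary collapses a passing run to its final summary line (output shown is illustrative):

pytest_summary tests/
# => "=========== 47 passed in 3.12s ==========="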

Anti-Patterns: What Models Do Wrong

Output Swallowing

Models sometimes use MORE tokens trying to compress
# Bad: Model pipes to /dev/null then describes
npm test > /dev/null 2>&1 && echo "Tests passed" || echo "Tests failed"
# Model then writes: "I ran the tests and they passed. All 47 test suites..."
# This uses MORE tokens than just showing the output!

Piping to head/tail

# Bad: Model truncates to save tokens
npm test 2>&1 | head -n 50

# Problem: Agent may need to re-run 5-minute test suite
# when the relevant error was at line 51
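
A safer pattern, if truncation is truly needed, stashes the complete log first so nothing forces a re-run (sketch; the log path is illustrative):

# Better: keep the full log on disk, surface only the tail,
# and tell the agent where the rest lives
npm test > /tmp/test.log 2>&1 || {
    tail -n 50 /tmp/test.log
    echo "(full output in /tmp/test.log)"
}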

This kind of token-conservative behavior burns more tokens, more human time, and more cognitive energy than simply surfacing the stashed output.


Full Implementation

A complete test runner with backpressure:

#!/bin/bash
# scripts/test.sh - Context-efficient test runner

set -e  # the first failing check aborts the run (after its output is dumped)

RED='\033[0;31m'
GREEN='\033[0;32m'
NC='\033[0m'

run_silent() {
    local description="$1"
    local command="$2"
    local tmp_file=$(mktemp)

    if eval "$command" > "$tmp_file" 2>&1; then
        printf "${GREEN}${NC} %s\n" "$description"
        rm -f "$tmp_file"
        return 0
    else
        local exit_code=$?
        printf "${RED}${NC} %s\n" "$description"
        echo ""
        cat "$tmp_file"
        rm -f "$tmp_file"
        return $exit_code
    fi
}

echo "Running test suite..."
echo ""

run_silent "Type check" "tsc --noEmit"
run_silent "Lint" "eslint src/ --max-warnings 0"
run_silent "Unit tests" "jest --bail"
run_silent "Integration tests" "jest --config jest.integration.config.js --bail"

echo ""
echo "All checks passed!"

TypeScript Version

For programmatic use:

import { spawn } from "child_process";

interface RunResult {
  success: boolean;
  output: string;
  exitCode: number;
}

async function runSilent(
  description: string,
  command: string,
  args: string[]
): Promise<RunResult> {
  return new Promise((resolve) => {
    const chunks: Buffer[] = [];
    const proc = spawn(command, args, { shell: true });

    proc.stdout.on("data", (data) => chunks.push(data));
    proc.stderr.on("data", (data) => chunks.push(data));

    proc.on("close", (exitCode) => {
      const output = Buffer.concat(chunks).toString();

      if (exitCode === 0) {
        console.log(`  ✓ ${description}`);
        resolve({ success: true, output: "", exitCode: 0 });
      } else {
        console.log(`  ✗ ${description}`);
        console.log(output);
        resolve({ success: false, output, exitCode: exitCode ?? 1 });
      }
    });
  });
}

// Usage in agent tooling
async function runTests(): Promise<string> {
  const results: RunResult[] = [];

  results.push(await runSilent("Type check", "npx", ["tsc", "--noEmit"]));
  results.push(await runSilent("Lint", "npx", ["eslint", "src/"]));
  results.push(await runSilent("Tests", "npx", ["jest", "--bail"]));

  const failed = results.filter((r) => !r.success);

  if (failed.length === 0) {
    return "All checks passed ✓";
  }

  return failed.map((r) => r.output).join("\n");
}

Key Principle

If you already know what matters, don’t make a model churn through thousands of junk tokens to rediscover it.

Deterministic output beats non-deterministic parsing.



Topics
CI/CD Efficiency · Context Engineering · Developer Experience · Output Management
