Agent Capabilities: Tools and Eyes

James Phoenix

Agents with more tools don’t just DO more—they DO better. Give them hands AND eyes.


The Core Insight

An agent’s capability is bounded by:

  1. What it can DO (tools, CLIs, MCPs)
  2. What it can SEE (observability, linters, type checkers)

Expand both and the agent becomes dramatically more effective.

More tools = More actions possible
More eyes  = Better decisions about which actions to take

Two Dimensions of Capability

Hands: What It Can DO

Tool Type          Examples                    Capability Unlocked
MCP Servers        Supabase, Linear, GitHub    Direct API access without boilerplate
Cloud CLIs         gcloud, aws, az             Infrastructure operations
Package managers   bun, npm, pip               Dependency management
Build tools        tsc, esbuild, vite          Compilation and bundling
Database CLIs      psql, redis-cli             Direct data operations

Eyes: What It Can SEE

Tool Type        Examples              Visibility Unlocked
Linters          Biome, ESLint, Ruff   Code quality issues
Type checkers    tsc, mypy, pyright    Type errors before runtime
OTEL/Tracing     Jaeger, Honeycomb     Runtime behavior
Test runners     Jest, pytest, vitest  What’s broken
Coverage tools   c8, coverage.py       What’s untested

MCP Servers: Superpowers

MCPs give agents direct access to external systems without writing integration code.

Supabase MCP

Agent can:
- Query tables directly
- Check RLS policies
- Inspect schema
- Debug auth issues with real data

Linear MCP

Agent can:
- Read issue context
- Update ticket status
- Link PRs to issues
- Understand project state

GitHub MCP

Agent can:
- Read PR comments
- Check CI status
- Review file changes
- Understand review feedback

The compound effect: Agent sees issue in Linear → reads context → checks related code → runs tests → updates PR → moves ticket. All in one flow.
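
In Claude Code, these servers can be declared once in a project-level .mcp.json so every session starts with them attached. A minimal sketch (the package names, flags, and Linear endpoint here are assumptions; check each server’s docs):

{
  "mcpServers": {
    "supabase": {
      "command": "npx",
      "args": ["-y", "@supabase/mcp-server-supabase", "--project-ref", "abcd1234"]
    },
    "linear": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.linear.app/sse"]
    }
  }
}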


Declare Available CLIs in CLAUDE.md

Agents don’t know what’s installed unless you tell them.

# CLAUDE.md

## Available CLIs

### Google Cloud
- Project: `my-project-id`
- `gcloud` is authenticated and configured
- Can run: `gcloud run deploy`, `gcloud pubsub`, `gcloud sql`

### AWS
- Profile: `default` (us-east-1)
- `aws` CLI is configured
- Can run: `aws s3`, `aws lambda`, `aws ecs`

### Supabase
- Project ref: `abcd1234`
- `supabase` CLI is linked
- Can run: `supabase db`, `supabase functions`

Now the agent knows it can:

# Deploy to Cloud Run
gcloud run deploy my-service --source .

# Check Pub/Sub messages
gcloud pubsub subscriptions pull my-sub --limit=10

# Query production database
supabase db dump --data-only | head -100

Without this declaration, the agent might write Python `boto3` code instead of just running `aws s3 cp`.


Linters as Eyes

Linters give agents immediate feedback on code quality.

Setup in CLAUDE.md

## Code Quality Tools

- Linter: `biome check src/` (auto-fixes with `--apply`)
- Types: `tsc --noEmit`
- Tests: `bun test`

Run these before considering any change complete.

What the Agent Sees

$ biome check src/

src/api/handler.ts:45:12
  ✖ Avoid using `any` type

src/utils/parse.ts:23:1
  ✖ This function has too many parameters (6). Maximum is 4.

The agent now knows exactly what to fix, with file:line precision.
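
To make “run these before done” mechanical rather than aspirational, chain the three commands into a single gate the agent runs after every change. A minimal sketch using Bun Shell (assumes Bun 1.1+; each command throws on a non-zero exit, so the first failure stops the gate):

// check.ts: run with `bun check.ts`
import { $ } from "bun";

// Each step throws on a non-zero exit code, so the gate stops at the first failure.
await $`biome check src/`;
await $`tsc --noEmit`;
await $`bun test`;

console.log("All checks passed: lint, types, tests");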


OTEL/Tracing as Eyes

Observability tools let agents see runtime behavior.

Setup

## Observability

- Jaeger UI: http://localhost:16686
- Can query traces: `curl localhost:16686/api/traces?service=my-service`
- Logs: `docker logs my-service --tail 100`

What the Agent Sees

Trace: POST /api/users
├─ middleware.auth: 2ms
├─ handler.createUser: 450ms   SLOW
│  ├─ db.query: 12ms
│  ├─ external.verify: 420ms   BOTTLENECK
│  └─ db.insert: 8ms
└─ response: 1ms

Agent immediately knows the external verification call is the problem.
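
Because Jaeger exposes the same HTTP API its UI consumes, the agent can pull traces and flag slow spans programmatically instead of eyeballing the trace view. A minimal sketch (the service name and 100ms threshold are placeholders; Jaeger reports span durations in microseconds):

// find-slow-spans.ts
const res = await fetch(
  "http://localhost:16686/api/traces?service=my-service&limit=20"
);
const { data } = await res.json();

for (const trace of data) {
  for (const span of trace.spans) {
    if (span.duration > 100_000) { // 100ms, in microseconds
      console.log(`SLOW: ${span.operationName} (${span.duration / 1000}ms)`);
    }
  }
}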


OTEL as Control Input (Not Just Eyes)

The next level: use telemetry as active feedback for automated optimization.

From Passive to Active

PASSIVE (Eyes):     Agent reads traces → Agent understands
ACTIVE (Control):   Agent reads traces → Evaluates constraints → Triggers fixes

Constraint-Driven Telemetry

# performance-constraints.yaml
constraints:
  latency:
    p99_max_ms: 100
    p90_max_ms: 50
  memory:
    max_mb: 300
    heap_growth_slope: 0  # No leaks
  errors:
    rate_max_percent: 0.1

actions:
  on_violation:
    - capture_detailed_trace
    - spawn_optimizer_agent

Automated Optimization Loop

// Sketch: `otel.query`, `evaluateConstraints`, `agent`, and `applyAndVerify`
// are illustrative helpers, not a specific SDK.
async function telemetryControlLoop() {
  // Pull the last 15 minutes of metrics for the service
  const metrics = await otel.query({
    service: 'my-service',
    window: '15m',
  });

  // Compare observed values against the declared setpoints
  const violations = evaluateConstraints(metrics, constraints);

  if (violations.length > 0) {
    // Telemetry becomes control input: diagnose, propose, apply, verify
    const diagnosis = await agent.analyze(violations);
    const fix = await agent.proposeFix(diagnosis);
    await applyAndVerify(fix);
  }
}
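
For completeness, here is one shape evaluateConstraints could take, mapping the YAML setpoints above onto violations (a sketch; the metric field names are assumptions):

interface Constraints {
  latency: { p99_max_ms: number; p90_max_ms: number };
  memory: { max_mb: number; heap_growth_slope: number };
  errors: { rate_max_percent: number };
}

interface Metrics {
  p99Ms: number;
  p90Ms: number;
  memoryMb: number;
  errorRatePct: number;
}

interface Violation { constraint: string; actual: number; limit: number }

function evaluateConstraints(m: Metrics, c: Constraints): Violation[] {
  // [name, observed value, setpoint] triples; a violation is any value over its limit
  const checks: Array<[string, number, number]> = [
    ["latency.p99_max_ms", m.p99Ms, c.latency.p99_max_ms],
    ["latency.p90_max_ms", m.p90Ms, c.latency.p90_max_ms],
    ["memory.max_mb", m.memoryMb, c.memory.max_mb],
    ["errors.rate_max_percent", m.errorRatePct, c.errors.rate_max_percent],
  ];
  return checks
    .filter(([, actual, limit]) => actual > limit)
    .map(([constraint, actual, limit]) => ({ constraint, actual, limit }));
}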

The Control Theory View

        ┌─────────────────────────────────────┐
        │                                     │
        ▼                                     │
┌──────────────┐    ┌──────────────┐    ┌─────┴────────┐
│  Constraints │───▶│   Agent      │───▶│   Service    │
│  (Setpoints) │    │ (Controller) │    │   (Plant)    │
└──────────────┘    └──────────────┘    └──────────────┘
                           ▲                   │
                           │                   │
                    ┌──────┴───────┐           │
                    │    OTEL      │◀──────────┘
                    │   (Sensor)   │
                    └──────────────┘

Telemetry isn’t just visibility—it’s the sensor in a control loop.

See: Closed-Loop Telemetry-Driven Optimization


The Capability Stack

Layer your tools for maximum agent effectiveness:

┌─────────────────────────────────────────┐
│               MCP Servers               │
│    (Supabase, Linear, GitHub, Slack)    │
├─────────────────────────────────────────┤
│               Cloud CLIs                │
│       (gcloud, aws, az, supabase)       │
├─────────────────────────────────────────┤
│            Code Quality Eyes            │
│       (biome, tsc, mypy, eslint)        │
├─────────────────────────────────────────┤
│              Runtime Eyes               │
│      (OTEL, logs, metrics, traces)      │
├─────────────────────────────────────────┤
│                Test Eyes                │
│        (jest, pytest, coverage)         │
└─────────────────────────────────────────┘

Example: Full-Stack Agent

With all tools available:

# CLAUDE.md

## MCPs Available
- Supabase MCP (database access)
- Linear MCP (issue tracking)

## CLIs Available
- `gcloud` - Project: prod-project-123
- `supabase` - Linked to production

## Quality Tools
- `biome check --apply` - Lint and format
- `tsc --noEmit` - Type check
- `bun test` - Run tests

## Observability
- Jaeger: http://localhost:16686
- Logs: `gcloud logging read "resource.type=cloud_run_revision"`

Now the agent can:

  1. Read the Linear ticket for context
  2. Query Supabase to understand the data
  3. Write the fix
  4. Run linter and type checker
  5. Run tests
  6. Check traces for performance
  7. Deploy with gcloud
  8. Update the Linear ticket

All in one conversation.


Key Principle

Every tool you add is a capability multiplier. Every eye you add is a decision-quality multiplier.

Don’t make agents guess or write boilerplate. Give them direct access.


Checklist: Agent Capability Audit

  • MCPs installed for external services?
  • Cloud CLIs declared in CLAUDE.md?
  • Project IDs/refs documented?
  • Linter configured and documented?
  • Type checker configured?
  • Test command documented?
  • Observability endpoints documented?
  • Log access commands documented?

Topics
Agent Capability Enhancement · AI Agents · CI/CD · Developer Tools · Observability Tools
