Agent Capabilities: Tools and Eyes

James Phoenix

Agents with more tools don’t just DO more—they DO better. Give them hands AND eyes.


The Core Insight

An agent’s capability is bounded by:

  1. What it can DO (tools, CLIs, MCPs)
  2. What it can SEE (observability, linters, type checkers)

Expand both and the agent becomes dramatically more effective.

More tools = More actions possible
More eyes  = Better decisions about which actions to take

Two Dimensions of Capability

Hands: What It Can DO

Tool Type          Examples                    Capability Unlocked
MCP Servers        Supabase, Linear, GitHub    Direct API access without boilerplate
Cloud CLIs         gcloud, aws, az             Infrastructure operations
Package managers   bun, npm, pip               Dependency management
Build tools        tsc, esbuild, vite          Compilation and bundling
Database CLIs      psql, redis-cli             Direct data operations

Eyes: What It Can SEE

Tool Type        Examples              Visibility Unlocked
Linters          Biome, ESLint, Ruff   Code quality issues
Type checkers    tsc, mypy, pyright    Type errors before runtime
OTEL/Tracing     Jaeger, Honeycomb     Runtime behavior
Test runners     Jest, pytest, vitest  What’s broken
Coverage tools   c8, coverage.py       What’s untested

MCP Servers: Superpowers

MCPs give agents direct access to external systems without writing integration code.

Supabase MCP

Agent can:
- Query tables directly
- Check RLS policies
- Inspect schema
- Debug auth issues with real data

Linear MCP

Agent can:
- Read issue context
- Update ticket status
- Link PRs to issues
- Understand project state

GitHub MCP

Agent can:
- Read PR comments
- Check CI status
- Review file changes
- Understand review feedback

The compound effect: Agent sees issue in Linear → reads context → checks related code → runs tests → updates PR → moves ticket. All in one flow.
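
In Claude Code, these servers can be declared once in a project-level .mcp.json so every session starts with them attached. A minimal sketch (the package names, flags, and Linear endpoint here are assumptions; check each server’s docs):

{
  "mcpServers": {
    "supabase": {
      "command": "npx",
      "args": ["-y", "@supabase/mcp-server-supabase", "--project-ref", "abcd1234"]
    },
    "linear": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.linear.app/sse"]
    }
  }
}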


Declare Available CLIs in CLAUDE.md

Agents don’t know what’s installed unless you tell them.

# CLAUDE.md

## Available CLIs

### Google Cloud
- Project: `my-project-id`
- `gcloud` is authenticated and configured
- Can run: `gcloud run deploy`, `gcloud pubsub`, `gcloud sql`

### AWS
- Profile: `default` (us-east-1)
- `aws` CLI is configured
- Can run: `aws s3`, `aws lambda`, `aws ecs`

### Supabase
- Project ref: `abcd1234`
- `supabase` CLI is linked
- Can run: `supabase db`, `supabase functions`

Now the agent knows it can:

# Deploy to Cloud Run
gcloud run deploy my-service --source .

# Check Pub/Sub messages
gcloud pubsub subscriptions pull my-sub --limit=10

# Query production database
supabase db dump --data-only | head -100

Without this declaration, the agent might write Python `boto3` code instead of just running `aws s3 cp`.


Linters as Eyes

Linters give agents immediate feedback on code quality.

Setup in CLAUDE.md

## Code Quality Tools

- Linter: `biome check src/` (auto-fixes with `--apply`)
- Types: `tsc --noEmit`
- Tests: `bun test`

Run these before considering any change complete.

What the Agent Sees

$ biome check src/

src/api/handler.ts:45:12
  ✖ Avoid using `any` type

src/utils/parse.ts:23:1
  ✖ This function has too many parameters (6). Maximum is 4.

The agent now knows exactly what to fix, with file:line precision.
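
To make “run these before done” mechanical rather than aspirational, chain the three commands into a single gate the agent runs after every change. A minimal sketch using Bun Shell (assumes Bun 1.1+; each command throws on a non-zero exit, so the first failure stops the gate):

// check.ts: run with `bun check.ts`
import { $ } from "bun";

// Each step throws on a non-zero exit code, so the gate stops at the first failure.
await $`biome check src/`;
await $`tsc --noEmit`;
await $`bun test`;

console.log("All checks passed: lint, types, tests");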


OTEL/Tracing as Eyes

Observability tools let agents see runtime behavior.

Setup

## Observability

- Jaeger UI: http://localhost:16686
- Can query traces: `curl localhost:16686/api/traces?service=my-service`
- Logs: `docker logs my-service --tail 100`

What the Agent Sees

Trace: POST /api/users
├─ middleware.auth: 2ms
├─ handler.createUser: 450ms   SLOW
│  ├─ db.query: 12ms
│  ├─ external.verify: 420ms   BOTTLENECK
│  └─ db.insert: 8ms
└─ response: 1ms

Agent immediately knows the external verification call is the problem.
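
Because Jaeger exposes the same HTTP API its UI consumes, the agent can pull traces and flag slow spans programmatically instead of eyeballing the trace view. A minimal sketch (the service name and 100ms threshold are placeholders; Jaeger reports span durations in microseconds):

// find-slow-spans.ts
const res = await fetch(
  "http://localhost:16686/api/traces?service=my-service&limit=20"
);
const { data } = await res.json();

for (const trace of data) {
  for (const span of trace.spans) {
    if (span.duration > 100_000) { // 100ms, in microseconds
      console.log(`SLOW: ${span.operationName} (${span.duration / 1000}ms)`);
    }
  }
}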


OTEL as Control Input (Not Just Eyes)

The next level: use telemetry as active feedback for automated optimization.

From Passive to Active

PASSIVE (Eyes):     Agent reads traces → Agent understands
ACTIVE (Control):   Agent reads traces → Evaluates constraints → Triggers fixes

Constraint-Driven Telemetry

# performance-constraints.yaml
constraints:
  latency:
    p99_max_ms: 100
    p90_max_ms: 50
  memory:
    max_mb: 300
    heap_growth_slope: 0  # No leaks
  errors:
    rate_max_percent: 0.1

actions:
  on_violation:
    - capture_detailed_trace
    - spawn_optimizer_agent

Automated Optimization Loop

// Sketch: `otel.query`, `evaluateConstraints`, `agent`, and `applyAndVerify`
// are illustrative helpers, not a specific SDK.
async function telemetryControlLoop() {
  // Pull the last 15 minutes of metrics for the service
  const metrics = await otel.query({
    service: 'my-service',
    window: '15m',
  });

  // Compare observed values against the declared setpoints
  const violations = evaluateConstraints(metrics, constraints);

  if (violations.length > 0) {
    // Telemetry becomes control input: diagnose, propose, apply, verify
    const diagnosis = await agent.analyze(violations);
    const fix = await agent.proposeFix(diagnosis);
    await applyAndVerify(fix);
  }
}
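
For completeness, here is one shape evaluateConstraints could take, mapping the YAML setpoints above onto violations (a sketch; the metric field names are assumptions):

interface Constraints {
  latency: { p99_max_ms: number; p90_max_ms: number };
  memory: { max_mb: number; heap_growth_slope: number };
  errors: { rate_max_percent: number };
}

interface Metrics {
  p99Ms: number;
  p90Ms: number;
  memoryMb: number;
  errorRatePct: number;
}

interface Violation { constraint: string; actual: number; limit: number }

function evaluateConstraints(m: Metrics, c: Constraints): Violation[] {
  // [name, observed value, setpoint] triples; a violation is any value over its limit
  const checks: Array<[string, number, number]> = [
    ["latency.p99_max_ms", m.p99Ms, c.latency.p99_max_ms],
    ["latency.p90_max_ms", m.p90Ms, c.latency.p90_max_ms],
    ["memory.max_mb", m.memoryMb, c.memory.max_mb],
    ["errors.rate_max_percent", m.errorRatePct, c.errors.rate_max_percent],
  ];
  return checks
    .filter(([, actual, limit]) => actual > limit)
    .map(([constraint, actual, limit]) => ({ constraint, actual, limit }));
}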

The Control Theory View

        ┌─────────────────────────────────────┐
        │                                     │
        ▼                                     │
┌──────────────┐    ┌──────────────┐    ┌─────┴────────┐
│  Constraints │───▶│   Agent      │───▶│   Service    │
│  (Setpoints) │    │ (Controller) │    │   (Plant)    │
└──────────────┘    └──────────────┘    └──────────────┘
                           ▲                   │
                           │                   │
                    ┌──────┴───────┐           │
                    │    OTEL      │◀──────────┘
                    │   (Sensor)   │
                    └──────────────┘

Telemetry isn’t just visibility—it’s the sensor in a control loop.

See: Closed-Loop Telemetry-Driven Optimization


The Capability Stack

Layer your tools for maximum agent effectiveness:

┌─────────────────────────────────────────┐
│               MCP Servers               │
│    (Supabase, Linear, GitHub, Slack)    │
├─────────────────────────────────────────┤
│               Cloud CLIs                │
│       (gcloud, aws, az, supabase)       │
├─────────────────────────────────────────┤
│            Code Quality Eyes            │
│       (biome, tsc, mypy, eslint)        │
├─────────────────────────────────────────┤
│              Runtime Eyes               │
│      (OTEL, logs, metrics, traces)      │
├─────────────────────────────────────────┤
│                Test Eyes                │
│        (jest, pytest, coverage)         │
└─────────────────────────────────────────┘

Example: Full-Stack Agent

With all tools available:

# CLAUDE.md

## MCPs Available
- Supabase MCP (database access)
- Linear MCP (issue tracking)

## CLIs Available
- `gcloud` - Project: prod-project-123
- `supabase` - Linked to production

## Quality Tools
- `biome check --apply` - Lint and format
- `tsc --noEmit` - Type check
- `bun test` - Run tests

## Observability
- Jaeger: http://localhost:16686
- Logs: `gcloud logging read "resource.type=cloud_run_revision"`

Now the agent can:

  1. Read the Linear ticket for context
  2. Query Supabase to understand the data
  3. Write the fix
  4. Run linter and type checker
  5. Run tests
  6. Check traces for performance
  7. Deploy with gcloud
  8. Update the Linear ticket

All in one conversation.


Key Principle

Every tool you add is a capability multiplier. Every eye you add is a decision-quality multiplier.

Don’t make agents guess or write boilerplate. Give them direct access.


Checklist: Agent Capability Audit

  • MCPs installed for external services?
  • Cloud CLIs declared in CLAUDE.md?
  • Project IDs/refs documented?
  • Linter configured and documented?
  • Type checker configured?
  • Test command documented?
  • Observability endpoints documented?
  • Log access commands documented?

Topics
Agent Capability Enhancement · AI Agents · CI/CD · Developer Tools · Observability Tools
