Agents with more tools don’t just DO more—they DO better. Give them hands AND eyes.
The Core Insight
An agent’s capability is bounded by:
- What it can DO (tools, CLIs, MCPs)
- What it can SEE (observability, linters, type checkers)
Expand both and the agent becomes dramatically more effective.
More tools = More actions possible
More eyes = Better decisions about which actions to take
Two Dimensions of Capability
Hands: What It Can DO
| Tool Type | Examples | Capability Unlocked |
|---|---|---|
| MCP Servers | Supabase, Linear, GitHub | Direct API access without boilerplate |
| Cloud CLIs | gcloud, aws, az | Infrastructure operations |
| Package managers | bun, npm, pip | Dependency management |
| Build tools | tsc, esbuild, vite | Compilation and bundling |
| Database CLIs | psql, redis-cli | Direct data operations |
Eyes: What It Can SEE
| Tool Type | Examples | Visibility Unlocked |
|---|---|---|
| Linters | Biome, ESLint, Ruff | Code quality issues |
| Type checkers | tsc, mypy, pyright | Type errors before runtime |
| OTEL/Tracing | Jaeger, Honeycomb | Runtime behavior |
| Test runners | Jest, pytest, vitest | What’s broken |
| Coverage tools | c8, coverage.py | What’s untested |
MCP Servers: Superpowers
MCPs give agents direct access to external systems without writing integration code.
Supabase MCP
Agent can:
- Query tables directly
- Check RLS policies
- Inspect schema
- Debug auth issues with real data
Linear MCP
Agent can:
- Read issue context
- Update ticket status
- Link PRs to issues
- Understand project state
GitHub MCP
Agent can:
- Read PR comments
- Check CI status
- Review file changes
- Understand review feedback
The compound effect: Agent sees issue in Linear → reads context → checks related code → runs tests → updates PR → moves ticket. All in one flow.
Declare Available CLIs in CLAUDE.md
Agents don’t know what’s installed unless you tell them.
# CLAUDE.md
## Available CLIs
### Google Cloud
- Project: `my-project-id`
- `gcloud` is authenticated and configured
- Can run: `gcloud run deploy`, `gcloud pubsub`, `gcloud sql`
### AWS
- Profile: `default` (us-east-1)
- `aws` CLI is configured
- Can run: `aws s3`, `aws lambda`, `aws ecs`
### Supabase
- Project ref: `abcd1234`
- `supabase` CLI is linked
- Can run: `supabase db`, `supabase functions`
Now the agent knows it can:
# Deploy to Cloud Run
gcloud run deploy my-service --source .
# Check Pub/Sub messages
gcloud pubsub subscriptions pull my-sub --limit=10
# Query production database
supabase db dump --data-only | head -100
Without this declaration, the agent might write Python `boto3` code instead of just running `aws s3 cp`.
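A small preflight script can keep the CLAUDE.md declaration honest by confirming that every listed CLI is actually on the PATH. A minimal sketch in TypeScript (runnable with Bun or Node); the CLI list is illustrative and should mirror whatever your own CLAUDE.md declares:

```typescript
import { execSync } from "node:child_process";

// CLIs declared in CLAUDE.md (adjust to match your own file).
const declaredClis = ["gcloud", "aws", "supabase", "bun", "psql"];

for (const cli of declaredClis) {
  try {
    // `command -v` exits non-zero when the binary is not on the PATH.
    execSync(`command -v ${cli}`, { stdio: "ignore" });
    console.log(`✓ ${cli} is available`);
  } catch {
    console.warn(`✗ ${cli} is declared but not installed`);
  }
}
```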
Linters as Eyes
Linters give agents immediate feedback on code quality.
Setup in CLAUDE.md
## Code Quality Tools
- Linter: `biome check src/` (auto-fixes with `--apply`)
- Types: `tsc --noEmit`
- Tests: `bun test`
Run these before considering any change complete.
What the Agent Sees
$ biome check src/
src/api/handler.ts:45:12
✖ Avoid using `any` type
src/utils/parse.ts:23:1
✖ This function has too many parameters (6). Maximum is 4.
The agent now knows exactly what to fix, with file:line precision.
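The three commands from CLAUDE.md can also be bundled into a single quality gate the agent runs after every change. A rough sketch, assuming the Biome, tsc, and bun commands shown above; swap in your own project's commands:

```typescript
import { spawnSync } from "node:child_process";

// Quality gates from CLAUDE.md, in the order they should run.
const gates = [
  { name: "lint", cmd: "biome check src/" },
  { name: "types", cmd: "tsc --noEmit" },
  { name: "tests", cmd: "bun test" },
];

for (const gate of gates) {
  const result = spawnSync(gate.cmd, { shell: true, encoding: "utf8" });
  if (result.status !== 0) {
    // Surface the file:line diagnostics so the agent knows exactly what to fix.
    console.error(`✖ ${gate.name} failed\n${result.stdout}${result.stderr}`);
    process.exit(1);
  }
  console.log(`✓ ${gate.name} passed`);
}
```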
OTEL/Tracing as Eyes
Observability tools let agents see runtime behavior.
Setup
## Observability
- Jaeger UI: http://localhost:16686
- Can query traces: `curl localhost:16686/api/traces?service=my-service`
- Logs: `docker logs my-service --tail 100`
What the Agent Sees
Trace: POST /api/users
├─ middleware.auth: 2ms
├─ handler.createUser: 450ms ← SLOW
│ ├─ db.query: 12ms
│ ├─ external.verify: 420ms ← BOTTLENECK
│ └─ db.insert: 8ms
└─ response: 1ms
Agent immediately knows the external verification call is the problem.
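The same trace data is reachable programmatically, so the agent can rank spans itself instead of eyeballing the UI. A hedged sketch against the Jaeger endpoint declared above; it assumes the response shape of Jaeger's UI API (`data[].spans[]` with `operationName` and `duration` in microseconds), which is unofficial and may vary between versions:

```typescript
// Pull recent traces from the local Jaeger instance declared in CLAUDE.md.
const res = await fetch(
  "http://localhost:16686/api/traces?service=my-service&limit=20"
);
const { data } = (await res.json()) as {
  data: { spans: { operationName: string; duration: number }[] }[];
};

// Flatten all spans and rank by duration (Jaeger reports microseconds).
const slowest = data
  .flatMap((trace) => trace.spans)
  .sort((a, b) => b.duration - a.duration)
  .slice(0, 5);

for (const span of slowest) {
  console.log(`${span.operationName}: ${(span.duration / 1000).toFixed(1)}ms`);
}
```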
OTEL as Control Input (Not Just Eyes)
The next level: use telemetry as active feedback for automated optimization.
From Passive to Active
PASSIVE (Eyes): Agent reads traces → Agent understands
ACTIVE (Control): Agent reads traces → Evaluates constraints → Triggers fixes
Constraint-Driven Telemetry
# performance-constraints.yaml
constraints:
  latency:
    p99_max_ms: 100
    p90_max_ms: 50
  memory:
    max_mb: 300
    heap_growth_slope: 0  # No leaks
  errors:
    rate_max_percent: 0.1

actions:
  on_violation:
    - capture_detailed_trace
    - spawn_optimizer_agent
Automated Optimization Loop
async function telemetryControlLoop() {
  const metrics = await otel.query({
    service: 'my-service',
    window: '15m',
  });

  const violations = evaluateConstraints(metrics, constraints);

  if (violations.length > 0) {
    // Telemetry becomes control input
    const diagnosis = await agent.analyze(violations);
    const fix = await agent.proposeFix(diagnosis);
    await applyAndVerify(fix);
  }
}
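The loop leans on an `evaluateConstraints` helper that is not shown above. One plausible sketch, assuming the optimizer receives metrics as a flat object of numbers; the field names (`latency_p99_ms`, `error_rate_percent`) and the constraint shape mirror `performance-constraints.yaml` but are illustrative:

```typescript
interface Violation {
  metric: string;
  limit: number;
  actual: number;
}

// Compare observed metrics against the setpoints from performance-constraints.yaml.
function evaluateConstraints(
  metrics: Record<string, number>,
  constraints: {
    latency: { p99_max_ms: number };
    errors: { rate_max_percent: number };
  }
): Violation[] {
  const checks: [string, number, number][] = [
    ["latency.p99", constraints.latency.p99_max_ms, metrics.latency_p99_ms],
    ["errors.rate", constraints.errors.rate_max_percent, metrics.error_rate_percent],
  ];

  return checks
    .filter(([, limit, actual]) => actual > limit)
    .map(([metric, limit, actual]) => ({ metric, limit, actual }));
}
```

The returned violations then become the diagnosis prompt for the optimizer agent, which is what closes the loop.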
The Control Theory View
┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│ Constraints  │───▶│    Agent     │───▶│   Service    │
│ (Setpoints)  │    │ (Controller) │    │   (Plant)    │
└──────────────┘    └──────────────┘    └──────────────┘
                           ▲                    │
                           │                    │
                    ┌──────┴───────┐            │
                    │     OTEL     │◀───────────┘
                    │   (Sensor)   │
                    └──────────────┘
Telemetry isn’t just visibility—it’s the sensor in a control loop.
See: Closed-Loop Telemetry-Driven Optimization
The Capability Stack
Layer your tools for maximum agent effectiveness:
┌─────────────────────────────────────────┐
│ MCP Servers │
│ (Supabase, Linear, GitHub, Slack) │
├─────────────────────────────────────────┤
│ Cloud CLIs │
│ (gcloud, aws, az, supabase) │
├─────────────────────────────────────────┤
│ Code Quality Eyes │
│ (biome, tsc, mypy, eslint) │
├─────────────────────────────────────────┤
│ Runtime Eyes │
│ (OTEL, logs, metrics, traces) │
├─────────────────────────────────────────┤
│ Test Eyes │
│ (jest, pytest, coverage) │
└─────────────────────────────────────────┘
Example: Full-Stack Agent
With all tools available:
# CLAUDE.md
## MCPs Available
- Supabase MCP (database access)
- Linear MCP (issue tracking)
## CLIs Available
- `gcloud` - Project: prod-project-123
- `supabase` - Linked to production
## Quality Tools
- `biome check --apply` - Lint and format
- `tsc --noEmit` - Type check
- `bun test` - Run tests
## Observability
- Jaeger: http://localhost:16686
- Logs: `gcloud logging read "resource.type=cloud_run_revision"`
Now the agent can:
- Read the Linear ticket for context
- Query Supabase to understand the data
- Write the fix
- Run linter and type checker
- Run tests
- Check traces for performance
- Deploy with gcloud
- Update the Linear ticket
All in one conversation.
Key Principle
Every tool you add is a capability multiplier. Every eye you add is a decision quality multiplier.
Don’t make agents guess or write boilerplate. Give them direct access.
Checklist: Agent Capability Audit
- MCPs installed for external services?
- Cloud CLIs declared in CLAUDE.md?
- Project IDs/refs documented?
- Linter configured and documented?
- Type checker configured?
- Test command documented?
- Observability endpoints documented?
- Log access commands documented?
Related
- Building the Harness – The capability stack is part of Layer 2
- Writing a Good CLAUDE.md – Where to declare tools
- Context-Efficient Backpressure – Compress tool output
- Agent-Native Architecture – Design systems where agents have full tool parity
- Sub-Agent Architecture – Configure tools for specialized sub-agents
- Agent Swarm Patterns – Multiple agents leveraging expanded capabilities

