Tool Call Validation: JSON Schema Validation for Tool Outputs

James Phoenix

Summary

Tool call validation enforces type safety at the boundary between LLM outputs and deterministic code. When an agent returns a tool call, the response is a JSON object with a tool name and parameters. This JSON must be validated before execution because LLMs can return malformed JSON, missing fields, invalid types, or values outside acceptable ranges. Runtime schema validation with libraries like Zod transforms untrusted LLM output into typed, validated data that your code can safely execute.

The Problem

LLM-generated tool calls are fundamentally untrusted input. Even well-prompted models produce invalid outputs under certain conditions.

Malformed JSON Structure

LLMs sometimes produce syntactically invalid JSON:

// Expected
{ "tool": "create_user", "parameters": { "name": "Alice", "email": "alice@example.com" } }

// Actual (LLM hallucinated extra comma)
{ "tool": "create_user", "parameters": { "name": "Alice", "email": "alice@example.com", } }

// Actual (unclosed string)
{ "tool": "create_user", "parameters": { "name": "Alice, "email": "alice@example.com" } }

JSON.parse() throws on both of these, and an unhandled exception crashes your agent.
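A minimal sketch of guarding the parse step, assuming the raw model output arrives as a string. The `safeJsonParse` helper and the `JsonParseResult` type are hypothetical names introduced here for illustration; they are not part of any library.

```typescript
// Result of attempting to parse raw model output as JSON.
type JsonParseResult =
  | { ok: true; value: unknown }
  | { ok: false; reason: string };

// Hypothetical helper: never throws, always returns a result object.
function safeJsonParse(raw: string): JsonParseResult {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    // JSON.parse throws a SyntaxError; keep its message for logging or retries.
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, reason };
  }
}

const wellFormed = safeJsonParse('{ "tool": "create_user", "parameters": { "name": "Alice" } }');
const trailingComma = safeJsonParse('{ "tool": "create_user", "parameters": { "name": "Alice", } }');

console.log(wellFormed.ok);    // true
console.log(trailingComma.ok); // false
```

Returning a result object instead of throwing keeps the failure on the normal control path, where the caller can choose to retry or reject.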

Missing Required Fields

Models omit required fields when context is ambiguous:

// Expected
{
  "tool": "transfer_funds",
  "parameters": {
    "from_account": "checking",
    "to_account": "savings",
    "amount": 500,
    "currency": "USD"
  }
}

// Actual (currency missing - defaults cause bugs)
{
  "tool": "transfer_funds",
  "parameters": {
    "from_account": "checking",
    "to_account": "savings",
    "amount": 500
  }
}

If your code assumes currency exists, you get undefined comparisons or incorrect defaults.
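As a rough illustration of how a missing field turns into a silent bug, consider the sketch below. The fee rule (USD is free, everything else costs a flat 25) and the function names are assumptions made up for this example.

```typescript
interface TransferParams {
  from_account: string;
  to_account: string;
  amount: number;
  currency?: string; // optional here to model what the LLM actually sent
}

// Hypothetical fee rule: USD transfers are free, everything else costs a flat 25.
function transferFee(params: TransferParams): number {
  // undefined === 'USD' is false, so a missing currency silently takes the non-USD branch.
  return params.currency === 'USD' ? 0 : 25;
}

// Stricter variant: refuse to guess when the field is absent.
function transferFeeStrict(params: TransferParams): number {
  if (params.currency === undefined) {
    throw new Error('currency is required');
  }
  return params.currency === 'USD' ? 0 : 25;
}

const incomplete: TransferParams = {
  from_account: 'checking',
  to_account: 'savings',
  amount: 500,
};

console.log(transferFee(incomplete)); // 25
```

The first function never fails, which is exactly the problem: the caller is charged a fee based on a currency nobody specified.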

Wrong Types

Type coercion causes subtle bugs:

// Expected: amount as number
{ "tool": "charge_customer", "parameters": { "amount": 1999 } }

// Actual: amount as string (LLM extracted from text)
{ "tool": "charge_customer", "parameters": { "amount": "1999" } }

// Bug: 1999 * 1.1 = 2198.9 (correct)
//      "1999" * 1.1 = 2198.9 (silently coerced), but "1999" + 100 = "1999100" (concatenation)

JavaScript’s weak typing masks these errors until production.
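The sketch below shows why a TypeScript annotation alone does not help here: `JSON.parse` returns `any`, so the declared type is never checked at runtime. `ChargeParams` and `isFiniteNumber` are hypothetical names used only for illustration.

```typescript
interface ChargeParams {
  amount: number;
}

// JSON.parse returns `any`, so this annotation is an unchecked promise, not a guarantee.
const charge: ChargeParams = JSON.parse('{ "amount": "1999" }');

// The compiler believes this is numeric addition; at runtime it is string concatenation.
const withFee = charge.amount + 100;
console.log(withFee); // prints 1999100, which is the string "1999100"

// Hypothetical runtime guard: accept only real, finite numbers.
function isFiniteNumber(value: unknown): value is number {
  return typeof value === 'number' && Number.isFinite(value);
}

console.log(isFiniteNumber(charge.amount)); // false
```

A runtime check like the guard above is what a schema library performs for every field.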

Invalid Values

Values outside acceptable ranges:

// Expected: valid percentage
{ "tool": "set_discount", "parameters": { "percentage": 15 } }

// Actual: impossible percentage
{ "tool": "set_discount", "parameters": { "percentage": 150 } }

// Actual: negative value
{ "tool": "set_discount", "parameters": { "percentage": -20 } }

Without validation, these values flow through your system and create bad state.

Injection Through Parameters

Tool parameters can contain malicious content:

// User asks: "Create a file called notes.txt"
// LLM generates:
{
  "tool": "write_file",
  "parameters": {
    "path": "../../../etc/passwd",  // Path traversal
    "content": "malicious content"
  }
}

// Or SQL injection through a search tool:
{
  "tool": "search_users",
  "parameters": {
    "query": "'; DROP TABLE users; --"
  }
}
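For the path traversal case, a substring check for `..` is a blunt tool: it also rejects harmless names such as `notes..txt`. One alternative is to resolve the path segment by segment and reject anything that would climb above the workspace root. The sketch below is a dependency-free illustration under that assumption; `resolveInWorkspace` is a hypothetical helper, it only handles `/`-separated paths, and it does not follow symlinks.

```typescript
// Hypothetical helper: resolve a model-supplied relative path against a
// workspace root and return null if it would land outside that root.
function resolveInWorkspace(workspaceRoot: string, requested: string): string | null {
  // Reject empty input, absolute paths, and embedded NUL bytes outright.
  if (requested.length === 0 || requested.startsWith('/') || requested.includes('\0')) {
    return null;
  }

  const resolved: string[] = [];
  for (const segment of requested.split('/')) {
    if (segment === '' || segment === '.') continue; // "a//b" and "./a" are harmless
    if (segment === '..') {
      // Stepping above the workspace root is the traversal case.
      if (resolved.length === 0) return null;
      resolved.pop();
      continue;
    }
    resolved.push(segment);
  }

  // "." or "a/.." names the root itself, which is not a writable file.
  if (resolved.length === 0) return null;

  const root = workspaceRoot.replace(/\/+$/, '');
  return `${root}/${resolved.join('/')}`;
}

console.log(resolveInWorkspace('/workspace', 'notes.txt'));           // /workspace/notes.txt
console.log(resolveInWorkspace('/workspace', '../../../etc/passwd')); // null
```

In a Node.js codebase the same containment check is usually written with the built-in `path` module. For the SQL case, the usual defence is a parameterized query rather than string validation.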

The Solution

Validate all tool outputs at the boundary before execution. Use a schema validation library that provides both runtime checks and TypeScript types.

Core Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Tool Call Flow                            │
├─────────────────────────────────────────────────────────────┤
│                                                              │
│  LLM Output                                                  │
│      │                                                       │
│      ▼                                                       │
│  ┌──────────────────┐                                       │
│  │  JSON.parse()    │ ─── Parse Error ───► Retry/Fallback   │
│  └──────────────────┘                                       │
│      │                                                       │
│      ▼                                                       │
│  ┌──────────────────┐                                       │
│  │  Schema.parse()  │ ─── Validation Error ─► Retry/Reject  │
│  └──────────────────┘                                       │
│      │                                                       │
│      ▼                                                       │
│  ┌──────────────────┐                                       │
│  │ Validated Data   │ ─── Type-safe ───► Execute Tool       │
│  └──────────────────┘                                       │
│                                                              │
└─────────────────────────────────────────────────────────────┘

Zod Schema Definitions

Define schemas for each tool’s parameters:

import { z } from 'zod';

// Base tool call structure
const BaseToolCallSchema = z.object({
  tool: z.string(),
  parameters: z.record(z.unknown()),
});

// Specific tool schemas
const CreateUserSchema = z.object({
  tool: z.literal('create_user'),
  parameters: z.object({
    name: z.string().min(1).max(100),
    email: z.string().email(),
    role: z.enum(['admin', 'user', 'guest']).default('user'),
  }),
});

const TransferFundsSchema = z.object({
  tool: z.literal('transfer_funds'),
  parameters: z.object({
    from_account: z.string(),
    to_account: z.string(),
    amount: z.number().positive().max(1_000_000),
    currency: z.enum(['USD', 'EUR', 'GBP']),
  }),
});

const WriteFileSchema = z.object({
  tool: z.literal('write_file'),
  parameters: z.object({
    // Prevent path traversal
    path: z.string()
      .refine(p => !p.includes('..'), 'Path traversal not allowed')
      .refine(p => p.startsWith('/workspace/'), 'Must be in workspace'),
    content: z.string().max(100_000),
  }),
});

// Union of all tool schemas
const ToolCallSchema = z.discriminatedUnion('tool', [
  CreateUserSchema,
  TransferFundsSchema,
  WriteFileSchema,
]);

type ToolCall = z.infer<typeof ToolCallSchema>;

Validation Function

Create a reusable validation layer:

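// Discriminated union: checking `success` narrows `data` and `error`,
// so callers can use `data` without a separate undefined check.
type ValidationResult<T> =
  | { success: true; data: T; error?: undefined }
  | {
      success: false;
      data?: undefined;
      error: {
        message: string;
        issues: z.ZodIssue[];
      };
    };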

function validateToolCall(raw: unknown): ValidationResult<ToolCall> {
  // Step 1: Ensure it's valid JSON (if string input)
  let parsed: unknown;
  if (typeof raw === 'string') {
    try {
      parsed = JSON.parse(raw);
    } catch (e) {
      return {
        success: false,
        error: {
          message: 'Invalid JSON',
          issues: [{ code: 'custom', message: 'JSON parse failed', path: [] }],
        },
      };
    }
  } else {
    parsed = raw;
  }

  // Step 2: Validate against schema
  const result = ToolCallSchema.safeParse(parsed);

  if (result.success) {
    return { success: true, data: result.data };
  }

  return {
    success: false,
    error: {
      message: 'Schema validation failed',
      issues: result.error.issues,
    },
  };
}

// Usage
const rawOutput = await llm.complete(prompt);
const validation = validateToolCall(rawOutput);

if (!validation.success) {
  console.error('Tool call validation failed:', validation.error);
  // Handle error: retry, fallback, or reject
} else {
  // validation.data is typed as ToolCall
  await executeToolCall(validation.data);
}

Type-Safe Execution

Once validated, execute with full type safety:

async function executeToolCall(toolCall: ToolCall): Promise<unknown> {
  switch (toolCall.tool) {
    case 'create_user':
      // toolCall.parameters is typed as { name: string, email: string, role: 'admin' | 'user' | 'guest' }
      return await userService.create({
        name: toolCall.parameters.name,
        email: toolCall.parameters.email,
        role: toolCall.parameters.role,
      });

    case 'transfer_funds':
      // toolCall.parameters is typed with amount: number, currency: 'USD' | 'EUR' | 'GBP'
      return await bankingService.transfer({
        from: toolCall.parameters.from_account,
        to: toolCall.parameters.to_account,
        amount: toolCall.parameters.amount,
        currency: toolCall.parameters.currency,
      });

    case 'write_file':
      // Path already validated to prevent traversal.
      // `fs` is assumed to be imported from 'node:fs/promises'.
      return await fs.writeFile(
        toolCall.parameters.path,
        toolCall.parameters.content
      );

    default:
      // TypeScript knows this is unreachable if all tools are handled
      const _exhaustive: never = toolCall;
      throw new Error(`Unknown tool: ${(_exhaustive as any).tool}`);
  }
}

Implementation Patterns

Pattern 1: Schema Registry

Centralize tool schemas for maintainability:

// schemas/tools.ts
export const toolSchemas = {
  create_user: z.object({
    name: z.string().min(1),
    email: z.string().email(),
  }),
  delete_user: z.object({
    user_id: z.string().uuid(),
    confirm: z.literal(true),  // Require explicit confirmation
  }),
  search_users: z.object({
    query: z.string().min(1).max(100),
    limit: z.number().int().min(1).max(100).default(10),
  }),
} as const;

type ToolName = keyof typeof toolSchemas;
type ToolParams<T extends ToolName> = z.infer<typeof toolSchemas[T]>;

// Validate a specific tool
function validateTool<T extends ToolName>(
  name: T,
  params: unknown
): ToolParams<T> | null {
  const schema = toolSchemas[name];
  const result = schema.safeParse(params);
  // The cast ties the parsed value back to the schema selected by `name`.
  return result.success ? (result.data as ToolParams<T>) : null;
}

Pattern 2: Error Recovery with Retry

When validation fails, retry with error context:

async function executeWithRetry(
  prompt: string,
  maxRetries: number = 3
): Promise<ToolCall> {
  let lastError: string | null = null;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // Include error context in retry prompt
    const fullPrompt = lastError
      ? `${prompt}\n\nPrevious attempt failed: ${lastError}\nPlease fix the error and try again.`
      : prompt;

    const response = await llm.complete(fullPrompt);
    const validation = validateToolCall(response);

    if (validation.success) {
      return validation.data;
    }

    // Format error for LLM
    lastError = validation.error.issues
      .map(i => `${i.path.join('.')}: ${i.message}`)
      .join('; ');

    console.log(`Attempt ${attempt + 1} failed: ${lastError}`);
  }

  throw new Error(`Tool call validation failed after ${maxRetries} attempts`);
}
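One detail in the retry loop is worth a sketch: issues raised at the root of the payload, such as the JSON parse failure with its empty `path`, format as `: JSON parse failed` because joining an empty array yields an empty string. The helper below labels those explicitly. `IssueLike` is a hypothetical structural stand-in for the two issue fields the loop reads, and `formatIssues` is an illustrative name.

```typescript
// Hypothetical structural stand-in for the issue fields used in the retry prompt.
interface IssueLike {
  path: (string | number)[];
  message: string;
}

// Turn validation issues into one line of feedback for the model.
function formatIssues(issues: IssueLike[]): string {
  return issues
    .map(issue => {
      // An empty path means the problem is with the payload as a whole.
      const where = issue.path.length > 0 ? issue.path.join('.') : '(root)';
      return `${where}: ${issue.message}`;
    })
    .join('; ');
}

const feedback = formatIssues([
  { path: ['parameters', 'amount'], message: 'Expected number, received string' },
  { path: [], message: 'JSON parse failed' },
]);

console.log(feedback);
// parameters.amount: Expected number, received string; (root): JSON parse failed
```

Naming the field path in the feedback gives the model something concrete to correct on the next attempt.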

Pattern 3: Structured Output Mode

Use model-specific structured output features when available:

// Anthropic Claude - forced tool use via tool_choice
const response = await anthropic.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  max_tokens: 1024,
  messages: [{ role: 'user', content: prompt }],
  // Declare the tool with a JSON Schema for its input
  tools: [{
    name: 'create_user',
    description: 'Create a new user account',
    input_schema: {
      type: 'object',
      properties: {
        name: { type: 'string', minLength: 1 },
        email: { type: 'string', format: 'email' },
        role: { type: 'string', enum: ['admin', 'user', 'guest'] },
      },
      required: ['name', 'email'],
    },
  }],
  tool_choice: { type: 'tool', name: 'create_user' },
});

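// Still validate! Model compliance isn't guaranteed
const toolCall = response.content.find(c => c.type === 'tool_use');
if (!toolCall || toolCall.type !== 'tool_use') {
  // The response may contain only text blocks
  throw new Error('Model response contained no tool_use block');
}
const validation = CreateUserSchema.safeParse({
  tool: toolCall.name,
  parameters: toolCall.input,
});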

Pattern 4: Coercion for Common Mistakes

Handle predictable LLM mistakes with coercion:

const CoercedAmountSchema = z.object({
  amount: z.union([
    z.number(),
    // LLM might return string that looks like number
    z.string().regex(/^\d+(\.\d{1,2})?$/).transform(s => parseFloat(s)),
  ]).pipe(z.number().positive()),
});

const CoercedBooleanSchema = z.union([
  z.boolean(),
  z.literal('true').transform(() => true),
  z.literal('false').transform(() => false),
  z.literal('yes').transform(() => true),
  z.literal('no').transform(() => false),
]).pipe(z.boolean());

// Example: LLM returns { amount: "150.00", confirm: "yes" }
const FlexibleToolSchema = z.object({
  tool: z.literal('payment'),
  parameters: z.object({
    amount: CoercedAmountSchema.shape.amount,
    confirm: CoercedBooleanSchema,
  }),
});

// Parses successfully, transforms to { amount: 150, confirm: true }

Pattern 5: Validation with Logging

Track validation failures for prompt improvement:

interface ValidationLog {
  timestamp: Date;
  prompt: string;
  rawOutput: unknown;
  success: boolean;
  errors?: z.ZodIssue[];
  toolName?: string;
}

const validationLogs: ValidationLog[] = [];

function validateAndLog(raw: unknown, prompt: string): ValidationResult<ToolCall> {
  const result = validateToolCall(raw);

  validationLogs.push({
    timestamp: new Date(),
    prompt,
    rawOutput: raw,
    success: result.success,
    errors: result.error?.issues,
    toolName: result.data?.tool,
  });

  // Periodic analysis reveals prompt weaknesses
  if (validationLogs.length % 100 === 0) {
    analyzeValidationFailures(validationLogs);
  }

  return result;
}

function analyzeValidationFailures(logs: ValidationLog[]): void {
  const failures = logs.filter(l => !l.success);
  const failureRate = failures.length / logs.length;

  // Group by error type
  const errorCounts = new Map<string, number>();
  for (const log of failures) {
    for (const issue of log.errors || []) {
      const key = `${issue.path.join('.')}: ${issue.code}`;
      errorCounts.set(key, (errorCounts.get(key) || 0) + 1);
    }
  }

  console.log(`Validation failure rate: ${(failureRate * 100).toFixed(1)}%`);
  console.log('Top errors:', [...errorCounts.entries()].sort((a, b) => b[1] - a[1]).slice(0, 5));
}

Advanced Patterns

Contextual Validation

Validate based on current application state:

function createContextualValidator(context: AppContext) {
  return z.object({
    tool: z.literal('transfer_funds'),
    parameters: z.object({
      from_account: z.string().refine(
        id => context.user.accounts.includes(id),
        'Account not owned by user'
      ),
      to_account: z.string(),
      amount: z.number()
        .positive()
        .refine(
          amt => amt <= context.user.dailyLimit,
          `Amount exceeds daily limit of ${context.user.dailyLimit}`
        ),
    }),
  });
}

// Usage
const userContext = await getUserContext(userId);
const validator = createContextualValidator(userContext);
const result = validator.safeParse(toolCall);

Semantic Validation

Validate meaning, not just structure:

const EmailActionSchema = z.object({
  tool: z.literal('send_email'),
  parameters: z.object({
    to: z.string().email(),
    subject: z.string().min(1).max(200),
    body: z.string().min(1).max(10000),
    // Semantic validation
    urgency: z.enum(['low', 'normal', 'high']),
  }),
}).refine(
  data => {
    // High urgency requires substantive body
    if (data.parameters.urgency === 'high') {
      return data.parameters.body.length >= 50;
    }
    return true;
  },
  { message: 'High urgency emails must have at least 50 characters in body' }
);

Chained Validation for Multi-Step Tools

Validate tool sequences:

const WorkflowStepSchema = z.discriminatedUnion('tool', [
  z.object({
    tool: z.literal('create_branch'),
    parameters: z.object({ branch_name: z.string().regex(/^[a-z0-9-]+$/) }),
  }),
  z.object({
    tool: z.literal('commit_changes'),
    parameters: z.object({ message: z.string().min(10) }),
  }),
  z.object({
    tool: z.literal('open_pr'),
    parameters: z.object({ title: z.string(), body: z.string() }),
  }),
]);

interface WorkflowState {
  branch_created: boolean;
  changes_committed: boolean;
}

function validateWorkflowStep(
  step: z.infer<typeof WorkflowStepSchema>,
  state: WorkflowState
): boolean {
  switch (step.tool) {
    case 'create_branch':
      return !state.branch_created;  // Can only create once
    case 'commit_changes':
      return state.branch_created;   // Must create branch first
    case 'open_pr':
      return state.branch_created && state.changes_committed;  // Must have both
  }
}
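The check above answers whether a step is allowed, but something still has to advance the state after each accepted step. The sketch below shows one way to validate a whole sequence, assuming steps are identified by tool name only. `StepName`, `isStepAllowed`, `advance`, and `validateSequence` are hypothetical names for this illustration.

```typescript
type StepName = 'create_branch' | 'commit_changes' | 'open_pr';

interface WorkflowState {
  branch_created: boolean;
  changes_committed: boolean;
}

// Same ordering rules as validateWorkflowStep, keyed on the tool name.
function isStepAllowed(step: StepName, state: WorkflowState): boolean {
  switch (step) {
    case 'create_branch':
      return !state.branch_created;
    case 'commit_changes':
      return state.branch_created;
    case 'open_pr':
      return state.branch_created && state.changes_committed;
  }
}

// Return the state that results from an accepted step.
function advance(step: StepName, state: WorkflowState): WorkflowState {
  switch (step) {
    case 'create_branch':
      return { ...state, branch_created: true };
    case 'commit_changes':
      return { ...state, changes_committed: true };
    case 'open_pr':
      return state;
  }
}

// Walk the sequence, stopping at the first step that violates the ordering.
function validateSequence(
  steps: StepName[]
): { valid: true } | { valid: false; failedAt: number } {
  let state: WorkflowState = { branch_created: false, changes_committed: false };
  for (let i = 0; i < steps.length; i++) {
    if (!isStepAllowed(steps[i], state)) {
      return { valid: false, failedAt: i };
    }
    state = advance(steps[i], state);
  }
  return { valid: true };
}

console.log(validateSequence(['create_branch', 'commit_changes', 'open_pr'])); // { valid: true }
console.log(validateSequence(['create_branch', 'open_pr'])); // { valid: false, failedAt: 1 }
```

Reporting the index of the failing step makes it possible to tell the model which call was out of order.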

Common Pitfalls

Pitfall 1: Trusting Model-Provided Types

// Wrong: Assuming model follows schema
const toolCall = await llm.complete(prompt);
await executeToolCall(toolCall);  // No validation!

// Correct: Always validate
const toolCall = await llm.complete(prompt);
const validated = validateToolCall(toolCall);
if (validated.success) {
  await executeToolCall(validated.data);
}

Pitfall 2: Validation Without Error Handling

// Wrong: Validation that crashes on failure
const data = ToolCallSchema.parse(raw);  // Throws on invalid input

// Correct: Safe parsing with error handling
const result = ToolCallSchema.safeParse(raw);
if (!result.success) {
  // Handle gracefully
  return { error: 'Invalid tool call', details: result.error.issues };
}

Pitfall 3: Loose String Patterns

// Wrong: Accept any string
const FilePathSchema = z.object({
  path: z.string(),
});

// Correct: Constrain to safe patterns
const FilePathSchema = z.object({
  path: z.string()
    .min(1)
    .max(255)
    .regex(/^[\w\-./]+$/, 'Invalid characters in path')
    .refine(p => !p.includes('..'), 'Path traversal not allowed')
    .refine(p => !p.startsWith('/'), 'Absolute paths not allowed'),
});

Pitfall 4: Missing Default Handling

// Wrong: Assuming .default() covers every "empty" value
const ToolSchema = z.object({
  limit: z.number().default(10),
});
ToolSchema.parse({});                    // { limit: 10 } - default applies to a missing key
ToolSchema.parse({ limit: undefined });  // { limit: 10 } - and to an explicit undefined
ToolSchema.parse({ limit: null });       // Throws! null is not undefined, so no default is applied

// Correct: LLMs often emit explicit nulls, so accept null and map it to the default
const ToolSchema = z.object({
  limit: z.number().nullish().transform(v => v ?? 10),
});

Pitfall 5: No Validation Metrics

// Wrong: Silent failures
const result = validateToolCall(raw);
if (!result.success) {
  return retry();  // No visibility into failure patterns
}

// Correct: Track and analyze
const result = validateToolCall(raw);
if (!result.success) {
  metrics.increment('tool_validation.failure', {
    tool: extractToolName(raw),
    error_type: categorizeError(result.error),
  });
  return retry();
}
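The `extractToolName` and `categorizeError` helpers used above are not defined in this article. The sketch below is one possible shape for them, assuming the error object carries a `message` and a list of issues with a `code`. The tag values `unparseable`, `unknown`, and `invalid_json` are arbitrary labels chosen for this example.

```typescript
// Hypothetical structural stand-in for the issue fields used for categorization.
interface IssueLike {
  code: string;
  path: (string | number)[];
}

interface ValidationErrorLike {
  message: string;
  issues: IssueLike[];
}

// Best-effort tool name for a metrics tag; the input has already failed validation.
function extractToolName(raw: unknown): string {
  let value: unknown = raw;
  if (typeof value === 'string') {
    try {
      value = JSON.parse(value);
    } catch {
      return 'unparseable';
    }
  }
  if (typeof value === 'object' && value !== null && 'tool' in value) {
    const tool = (value as { tool: unknown }).tool;
    if (typeof tool === 'string') return tool;
  }
  return 'unknown';
}

// Collapse an error into a single low-cardinality tag.
function categorizeError(error: ValidationErrorLike | undefined): string {
  if (!error) return 'unknown';
  if (error.message === 'Invalid JSON') return 'invalid_json';
  const first = error.issues[0];
  return first ? first.code : 'unknown';
}

console.log(extractToolName('{"tool":"set_discount","parameters":{"percentage":150}}')); // set_discount
console.log(categorizeError({
  message: 'Schema validation failed',
  issues: [{ code: 'too_big', path: ['parameters', 'percentage'] }],
})); // too_big
```

Tagging by tool name and first issue code keeps the number of distinct metric series small while still showing which tools and which constraints fail most often.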

Benefits

1. Runtime Safety

Validation catches errors before they corrupt state:

Without validation:
- Invalid amount passed to payment system
- Transaction created with wrong value
- Refund required, customer support involved
- Hours of debugging

With validation:
- Invalid amount caught at boundary
- Clear error: "amount must be positive"
- LLM retries with corrected value
- Transaction succeeds on second attempt

2. Type Inference

Zod provides TypeScript types automatically:

const UserSchema = z.object({
  name: z.string(),
  email: z.string().email(),
});

type User = z.infer<typeof UserSchema>;
// User = { name: string; email: string }

// IDE autocomplete works after validation
const result = UserSchema.safeParse(data);
if (result.success) {
  console.log(result.data.email);  // TypeScript knows email exists
}

3. Documentation as Code

Schemas document expected formats:

// Schema IS the documentation
const CreateOrderSchema = z.object({
  tool: z.literal('create_order'),
  parameters: z.object({
    customer_id: z.string().uuid(),
    items: z.array(z.object({
      product_id: z.string(),
      quantity: z.number().int().positive().max(100),
      price_override: z.number().positive().optional(),
    })).min(1).max(50),
    shipping_address: AddressSchema,  // assumed to be defined elsewhere
    notes: z.string().max(500).optional(),
  }),
});

4. Testability

Schemas enable property-based testing:

import fc from 'fast-check';
import { ZodFastCheck } from 'zod-fast-check';

// Derive a fast-check arbitrary from the Zod schema
const userArbitrary = ZodFastCheck().inputOf(UserSchema);

test('user processing handles all valid inputs', () => {
  fc.assert(
    fc.property(userArbitrary, (user) => {
      const result = processUser(user);
      expect(result.success).toBe(true);
    })
  );
});
