Batch API Patterns for Cost Reduction

James Phoenix

Summary

The Anthropic Batch API provides 50% cost reduction on API calls by processing requests asynchronously with up to 24-hour turnaround. For non-time-sensitive workloads like nightly code reviews, test generation, documentation updates, and bulk refactoring, batch processing cuts costs in half while maintaining full model capabilities.

The Problem

Interactive AI assistance requires immediate responses. You ask a question, you get an answer. But many development workflows don’t need real-time interaction:

  • Nightly code quality scans
  • Bulk test generation for new modules
  • Documentation updates across a codebase
  • Large-scale refactoring validation
  • Security vulnerability analysis
  • License compliance checks

Running these tasks interactively wastes money. You pay full price for work that could wait until morning.

The Economics

Consider a nightly code review workflow:

Interactive API (real-time):
- 50 files reviewed
- 5,000 tokens input per file
- 500 tokens output per file
- Model: Claude Sonnet

Input cost: 50 × 5,000 × $0.000003 = $0.75
Output cost: 50 × 500 × $0.000015 = $0.375
Total: $1.125 per run
Monthly (22 workdays): $24.75

Batch API (50% discount):
Input cost: 50 × 5,000 × $0.0000015 = $0.375
Output cost: 50 × 500 × $0.0000075 = $0.1875
Total: $0.5625 per run
Monthly: $12.38

Annual savings: $148.50 per workflow

For teams running multiple batch workflows (code review, test generation, docs), savings compound to thousands per year.
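The arithmetic above generalizes into a small estimator. A sketch, with the per-token prices hardcoded from the example (verify against current Sonnet pricing before relying on the numbers):

```typescript
// Estimate interactive vs batch cost for a uniform workload.
// Default prices match the example above ($3 / $15 per million tokens).
interface WorkloadEstimate {
  interactive: number
  batch: number
  monthlySavings: number
}

function estimateBatchSavings(
  requests: number,
  inputTokensPerRequest: number,
  outputTokensPerRequest: number,
  runsPerMonth: number = 22,
  inputPricePerToken: number = 0.000003,
  outputPricePerToken: number = 0.000015
): WorkloadEstimate {
  const interactive =
    requests * inputTokensPerRequest * inputPricePerToken +
    requests * outputTokensPerRequest * outputPricePerToken
  const batch = interactive * 0.5 // Batch API: 50% discount on both directions
  return {
    interactive,
    batch,
    monthlySavings: (interactive - batch) * runsPerMonth
  }
}
```

Plugging in the example's numbers (50 files, 5,000 input / 500 output tokens) reproduces $1.125 interactive, $0.5625 batch, and $12.375 saved per month.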

The Solution

Use the Batch API for any workload that can tolerate up to 24 hours of latency. Structure work as independent requests, submit in batches, and poll for completion.

Core Concepts

Batch vs Interactive:

| Characteristic | Interactive API | Batch API |
| --- | --- | --- |
| Latency | Seconds | Up to 24 hours |
| Cost | Full price | 50% discount |
| Rate limits | Standard | Higher throughput |
| Use case | Interactive dev | Scheduled jobs |
| Minimum requests | 1 | 1 |
| Maximum requests | 1 | 10,000 per batch |

How Batch Processing Works:

  1. Create a batch with multiple requests
  2. Submit the batch to Anthropic
  3. Poll for completion (or receive webhook)
  4. Retrieve results when ready

The API processes requests in parallel on Anthropic’s infrastructure. You don’t manage concurrency. You just submit work and wait.

Implementation: Basic Batch Submission

import Anthropic from '@anthropic-ai/sdk'

const client = new Anthropic()

interface BatchRequest {
  custom_id: string
  params: Anthropic.MessageCreateParams
}

async function submitBatch(requests: BatchRequest[]): Promise<string> {
  // Create the batch
  const batch = await client.beta.messages.batches.create({
    requests: requests.map(req => ({
      custom_id: req.custom_id,
      params: req.params
    }))
  })

  console.log(`Batch submitted: ${batch.id}`)
  console.log(`Requests: ${requests.length}`)
  console.log(`Status: ${batch.processing_status}`)

  return batch.id
}

// Example: Submit code reviews for multiple files
async function submitCodeReviews(files: Array<{ path: string; content: string }>) {
  const requests: BatchRequest[] = files.map((file, index) => ({
    custom_id: `review-${index}-${file.path}`,
    params: {
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: 2048,
      messages: [{
        role: 'user',
        content: `Review this code for bugs, security issues, and style problems.

File: ${file.path}

\`\`\`
${file.content}
\`\`\`

Provide specific, actionable feedback.`
      }]
    }
  }))

  return submitBatch(requests)
}

Implementation: Polling for Results

interface BatchResult {
  custom_id: string
  result: {
    type: 'succeeded' | 'errored' | 'canceled' | 'expired'
    message?: Anthropic.Message
    error?: { type: string; message: string }
  }
}

async function pollBatchResults(
  batchId: string,
  maxWaitMinutes: number = 60
): Promise<BatchResult[]> {
  const startTime = Date.now()
  const maxWaitMs = maxWaitMinutes * 60 * 1000

  while (Date.now() - startTime < maxWaitMs) {
    const batch = await client.beta.messages.batches.retrieve(batchId)

    console.log(`Status: ${batch.processing_status}`)
    console.log(`Progress: ${batch.request_counts.succeeded} succeeded, ${batch.request_counts.errored} errored, ${batch.request_counts.processing} still processing`)

    if (batch.processing_status === 'ended') {
      // Fetch all results
      const results: BatchResult[] = []

      for await (const result of client.beta.messages.batches.results(batchId)) {
        results.push(result as BatchResult)
      }

      return results
    }

    // Wait before polling again (exponential backoff)
    const waitTime = Math.min(30000, 5000 * Math.pow(1.5, Math.floor((Date.now() - startTime) / 60000)))
    await new Promise(resolve => setTimeout(resolve, waitTime))
  }

  throw new Error(`Batch ${batchId} did not complete within ${maxWaitMinutes} minutes`)
}

// Process results
function processCodeReviewResults(results: BatchResult[]) {
  const reviews: Array<{ file: string; feedback: string; status: string }> = []

  for (const result of results) {
    const filePath = result.custom_id.replace(/^review-\d+-/, '')

    if (result.result.type === 'succeeded' && result.result.message) {
      const content = result.result.message.content[0]
      reviews.push({
        file: filePath,
        feedback: content.type === 'text' ? content.text : '',
        status: 'success'
      })
    } else {
      reviews.push({
        file: filePath,
        feedback: result.result.error?.message || 'Unknown error',
        status: 'error'
      })
    }
  }

  return reviews
}
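Collected reviews usually end up in an issue or chat message. A small formatter over the shape returned by processCodeReviewResults keeps that step simple; failures are listed first so they surface at the top:

```typescript
// Render collected reviews as a markdown report, failures first.
interface ReviewEntry {
  file: string
  feedback: string
  status: string
}

function formatReviewReport(reviews: ReviewEntry[]): string {
  const failed = reviews.filter(r => r.status === 'error')
  const ok = reviews.filter(r => r.status === 'success')
  const lines: string[] = [
    '# Nightly Code Review',
    '',
    `${ok.length} reviewed, ${failed.length} failed.`,
    ''
  ]
  for (const r of failed) {
    lines.push(`## ${r.file} (failed)`, '', r.feedback, '')
  }
  for (const r of ok) {
    lines.push(`## ${r.file}`, '', r.feedback, '')
  }
  return lines.join('\n')
}
```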

Use Case Patterns

Pattern 1: Nightly Code Review

Run comprehensive code analysis overnight when nobody needs interactive feedback.

// scripts/nightly-review.ts
import { glob } from 'glob'
import { readFile, writeFile } from 'fs/promises'

async function runNightlyReview() {
  // Gather files changed in the last 24 hours
  const changedFiles = await getRecentlyChangedFiles(24)

  console.log(`Found ${changedFiles.length} files to review`)

  // Filter to code files only
  const codeFiles = changedFiles.filter(f =>
    /\.(ts|tsx|js|jsx|py|go|rs)$/.test(f)
  )

  // Read file contents
  const filesWithContent = await Promise.all(
    codeFiles.map(async path => ({
      path,
      content: await readFile(path, 'utf-8')
    }))
  )

  // Submit batch
  const batchId = await submitCodeReviews(filesWithContent)

  // Store batch ID for morning retrieval
  await writeFile('.batch-state/current-review.json', JSON.stringify({
    batchId,
    submittedAt: new Date().toISOString(),
    fileCount: filesWithContent.length
  }))

  console.log(`Batch ${batchId} submitted for ${filesWithContent.length} files`)
  console.log('Results will be ready by morning')
}

// Run at 2 AM via cron/scheduler
runNightlyReview()

# .github/workflows/nightly-review.yml
name: Nightly Code Review

on:
  schedule:
    - cron: '0 2 * * *'  # 2 AM daily

jobs:
  submit-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # Need history for changed files

      - name: Submit batch review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: npx ts-node scripts/nightly-review.ts

      - name: Save batch state
        uses: actions/upload-artifact@v4
        with:
          name: batch-state
          path: .batch-state/

  collect-results:
    runs-on: ubuntu-latest
    needs: submit-review
    # Note: `needs` runs this job immediately after submission. In practice,
    # collect results from a separate scheduled workflow (e.g. a morning cron)
    # or have the collect script poll until the batch ends.
    if: ${{ always() }}
    steps:
      - uses: actions/checkout@v4

      - name: Download batch state
        uses: actions/download-artifact@v4
        with:
          name: batch-state
          path: .batch-state/

      - name: Collect and report results
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: npx ts-node scripts/collect-review-results.ts

      - name: Create issue with findings
        uses: actions/github-script@v7
        with:
          script: |
            const results = require('./.batch-state/review-results.json')
            // Create GitHub issue with review findings

Pattern 2: Bulk Test Generation

Generate tests for an entire module at once, not file by file.

interface TestGenRequest {
  sourcePath: string
  sourceContent: string
  existingTests?: string
  framework: 'jest' | 'vitest' | 'pytest'
}

async function generateTestsBatch(requests: TestGenRequest[]): Promise<string> {
  const batchRequests = requests.map((req, i) => ({
    custom_id: `test-gen-${i}-${req.sourcePath}`,
    params: {
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: 4096,
      messages: [{
        role: 'user',
        content: `Generate comprehensive tests for this code using ${req.framework}.

Source file: ${req.sourcePath}

\`\`\`
${req.sourceContent}
\`\`\`

${req.existingTests ? `Existing tests (don't duplicate):\n\`\`\`\n${req.existingTests}\n\`\`\`` : ''}

Requirements:
- Test all public functions/methods
- Include edge cases and error conditions
- Use descriptive test names
- Mock external dependencies
- Follow ${req.framework} best practices

Output only the test code, no explanations.`
      }]
    }
  }))

  return submitBatch(batchRequests)
}

// Usage: Generate tests for all files in a directory
async function generateModuleTests(moduleDir: string) {
  const sourceFiles = await glob(`${moduleDir}/**/*.ts`, {
    ignore: ['**/*.test.ts', '**/*.spec.ts']
  })

  const requests = await Promise.all(
    sourceFiles.map(async path => {
      const testPath = path.replace(/\.ts$/, '.test.ts')
      const existingTests = await readFile(testPath, 'utf-8').catch(() => undefined)

      return {
        sourcePath: path,
        sourceContent: await readFile(path, 'utf-8'),
        existingTests,
        framework: 'vitest' as const
      }
    })
  )

  const batchId = await generateTestsBatch(requests)
  console.log(`Test generation batch: ${batchId}`)

  return batchId
}
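Writing the generated tests back to disk is the other half of this pattern. One sketch, assuming the custom_id shape used in generateTestsBatch above; note that model output often arrives wrapped in a markdown fence and needs stripping first:

```typescript
import { writeFile } from 'fs/promises'

// Strip a surrounding markdown code fence, if present, from model output.
function stripCodeFence(output: string): string {
  const match = output.match(/^```[\w-]*\n([\s\S]*?)\n```\s*$/)
  return match ? match[1] : output
}

async function writeGeneratedTests(
  results: Array<{ custom_id: string; code: string }>
) {
  for (const r of results) {
    // custom_id shape from generateTestsBatch: test-gen-<i>-<sourcePath>
    const sourcePath = r.custom_id.replace(/^test-gen-\d+-/, '')
    const testPath = sourcePath.replace(/\.ts$/, '.test.ts')
    await writeFile(testPath, stripCodeFence(r.code))
    console.log(`Wrote ${testPath}`)
  }
}
```

Run the generated tests before committing them; batch output still needs the same review as any other model-written code.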

Pattern 3: Documentation Updates

Bulk-generate or update documentation across a codebase.

async function updateDocsBatch(
  files: Array<{ path: string; content: string; existingDocs?: string }>
): Promise<string> {
  const requests = files.map((file, i) => ({
    custom_id: `docs-${i}-${file.path}`,
    params: {
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: 3000,
      messages: [{
        role: 'user',
        content: `Generate or update JSDoc/TSDoc documentation for this code.

File: ${file.path}

\`\`\`typescript
${file.content}
\`\`\`

${file.existingDocs ? `Current documentation:\n${file.existingDocs}\n\nUpdate if outdated.` : 'No existing documentation.'}

Requirements:
- Document all exported functions, classes, and types
- Include @param, @returns, @throws, @example where appropriate
- Keep descriptions concise but complete
- Match existing documentation style if present

Output only the documented code, no explanations.`
      }]
    }
  }))

  return submitBatch(requests)
}

Pattern 4: Security Vulnerability Analysis

Batch-scan code for security issues without blocking development.

const SECURITY_PROMPT = `Analyze this code for security vulnerabilities.

Focus on:
1. Injection vulnerabilities (SQL, command, XSS)
2. Authentication and authorization flaws
3. Sensitive data exposure
4. Insecure dependencies usage
5. Cryptographic issues
6. Race conditions
7. Resource leaks

For each issue found, provide:
- Severity (Critical/High/Medium/Low)
- Location (line number or function)
- Description of the vulnerability
- Remediation guidance

If no issues found, state "No security issues identified."`

async function runSecurityScan(files: Array<{ path: string; content: string }>) {
  const requests = files.map((file, i) => ({
    custom_id: `security-${i}-${file.path}`,
    params: {
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: 2048,
      messages: [{
        role: 'user',
        content: `${SECURITY_PROMPT}

File: ${file.path}

\`\`\`
${file.content}
\`\`\``
      }]
    }
  }))

  return submitBatch(requests)
}
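The scan output is free text, so gating CI on it needs a light parser. A heuristic sketch that tallies the "Severity" lines the prompt above asks for (treat it as a heuristic, not a grammar; the model may deviate from the format):

```typescript
// Tally severities from the free-text scan output so a CI job can
// gate on critical findings.
type Severity = 'Critical' | 'High' | 'Medium' | 'Low'

function tallySeverities(report: string): Record<Severity, number> {
  const counts: Record<Severity, number> = { Critical: 0, High: 0, Medium: 0, Low: 0 }
  if (/no security issues identified/i.test(report)) return counts
  for (const match of report.matchAll(/Severity\s*[:(]?\s*(Critical|High|Medium|Low)/gi)) {
    // Normalize casing so "critical" and "Critical" count together
    const sev = (match[1][0].toUpperCase() + match[1].slice(1).toLowerCase()) as Severity
    counts[sev]++
  }
  return counts
}
```

A morning job can then fail loudly when anything critical appears, e.g. `if (counts.Critical > 0) process.exit(1)`.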

Pattern 5: Large-Scale Refactoring Validation

Validate proposed refactors across many files before applying.

interface RefactorValidation {
  originalPath: string
  originalContent: string
  proposedContent: string
  refactorDescription: string
}

async function validateRefactorsBatch(validations: RefactorValidation[]) {
  const requests = validations.map((v, i) => ({
    custom_id: `refactor-${i}-${v.originalPath}`,
    params: {
      model: 'claude-sonnet-4-5-20250929',
      max_tokens: 2048,
      messages: [{
        role: 'user',
        content: `Validate this refactoring for correctness and potential issues.

Refactoring: ${v.refactorDescription}

Original code (${v.originalPath}):
\`\`\`
${v.originalContent}
\`\`\`

Proposed code:
\`\`\`
${v.proposedContent}
\`\`\`

Check for:
1. Behavior changes (intended vs unintended)
2. Type compatibility
3. Missing imports or exports
4. Broken references
5. Edge case handling changes

Output:
- VALID if refactoring is correct
- INVALID with specific issues if problems found`
      }]
    }
  }))

  return submitBatch(requests)
}
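Since the prompt pins the output to a VALID/INVALID convention, partitioning the results is mechanical. A sketch, assuming the custom_id shape used above; anything that does not clearly start with VALID is held back for human review:

```typescript
// Partition refactor validations by the VALID/INVALID convention.
function partitionVerdicts(
  results: Array<{ custom_id: string; text: string }>
): { valid: string[]; invalid: string[] } {
  const valid: string[] = []
  const invalid: string[] = []
  for (const r of results) {
    // custom_id shape from validateRefactorsBatch: refactor-<i>-<path>
    const path = r.custom_id.replace(/^refactor-\d+-/, '')
    if (/^\s*VALID\b/.test(r.text)) valid.push(path)
    else invalid.push(path)
  }
  return { valid, invalid }
}
```

Only the `valid` set gets applied automatically; the `invalid` set becomes a review queue with the model's objections attached.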

Workflow Integration

GitHub Actions: Submit and Collect Pattern

# Two-job pattern: submit batch, then collect results later
name: Batch Processing Workflow

on:
  schedule:
    - cron: '0 22 * * *'  # 10 PM: submit batch
  workflow_dispatch:
    inputs:
      action:
        type: choice
        options:
          - submit
          - collect

jobs:
  submit-batch:
    if: github.event_name == 'schedule' || github.event.inputs.action == 'submit'
    runs-on: ubuntu-latest
    outputs:
      batch_id: ${{ steps.submit.outputs.batch_id }}
    steps:
      - uses: actions/checkout@v4

      - name: Submit batch
        id: submit
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          BATCH_ID=$(node scripts/submit-batch.js)
          echo "batch_id=$BATCH_ID" >> $GITHUB_OUTPUT

      - name: Store batch ID
        run: |
          echo "${{ steps.submit.outputs.batch_id }}" > .batch-id

      - uses: actions/upload-artifact@v4
        with:
          name: batch-id
          path: .batch-id
          retention-days: 1

  collect-results:
    if: github.event.inputs.action == 'collect'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/download-artifact@v4
        with:
          name: batch-id

      - name: Collect results
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          BATCH_ID=$(cat .batch-id)
          node scripts/collect-results.js "$BATCH_ID"

Webhook-Based Collection (Recommended)

Instead of polling, receive notifications when batches complete:

// server/webhook-handler.ts
import express from 'express'
import crypto from 'crypto'

const app = express()

// Verify webhook signature
function verifyWebhookSignature(payload: string, signature: string): boolean {
  const secret = process.env.ANTHROPIC_WEBHOOK_SECRET!
  const expected = crypto
    .createHmac('sha256', secret)
    .update(payload)
    .digest('hex')
  return crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected))
}

app.post('/webhooks/anthropic', express.raw({ type: 'application/json' }), async (req, res) => {
  const signature = req.headers['anthropic-signature'] as string
  const payload = req.body.toString()

  if (!verifyWebhookSignature(payload, signature)) {
    return res.status(401).send('Invalid signature')
  }

  const event = JSON.parse(payload)

  if (event.type === 'batch.completed') {
    const batchId = event.data.batch_id
    console.log(`Batch ${batchId} completed`)

    // Trigger result processing
    await processBatchResults(batchId)
  }

  res.status(200).send('OK')
})

async function processBatchResults(batchId: string) {
  const results = await pollBatchResults(batchId, 1)  // Should be immediate

  // Store results
  await writeFile(
    `results/${batchId}.json`,
    JSON.stringify(results, null, 2)
  )

  // Notify team
  await sendSlackMessage(`Batch ${batchId} complete: ${results.length} results`)
}

Cost Comparison Table

| Workflow | Interactive Cost | Batch Cost | Monthly Savings |
| --- | --- | --- | --- |
| 50 file code review | $1.125 | $0.56 | $12.43 |
| 100 test generations | $4.50 | $2.25 | $49.50 |
| 200 doc updates | $3.00 | $1.50 | $33.00 |
| Security scan (100 files) | $2.25 | $1.13 | $24.64 |
| Total (daily) | $10.88 | $5.44 | $119.57 |

Assumptions: Sonnet pricing ($3/MTok input, $15/MTok output), 22 workdays/month; per-request token counts vary by workflow.

Best Practices

1. Batch Similar Work Together

Group requests by type for easier result processing:

// Good: Homogeneous batches
const reviewBatch = await submitBatch(codeReviewRequests)
const testBatch = await submitBatch(testGenRequests)

// Avoid: Mixed batches that complicate result handling
const mixedBatch = await submitBatch([...reviews, ...tests, ...docs])

2. Use Meaningful Custom IDs

Include enough context to process results without the original request:

// Good: Rich context in ID
custom_id: `review-${sha.slice(0, 8)}-src/api/users.ts`

// Bad: Opaque ID
custom_id: `req-${Math.random()}`
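One caveat when packing paths into IDs: the API documents custom_id as restricted to 1-64 characters of letters, digits, hyphens, and underscores (check the current API reference), so raw file paths with slashes and dots need sanitizing first. A sketch:

```typescript
// Build a deterministic custom_id from context parts, replacing any
// disallowed characters and truncating to the documented 64-char limit.
function toCustomId(...parts: string[]): string {
  return parts
    .join('-')
    .replace(/[^A-Za-z0-9_-]+/g, '-')
    .slice(0, 64)
}
```

Usage: `toCustomId('review', sha.slice(0, 8), filePath)` keeps the rich context while staying within the allowed character set. Truncation can collide for very long paths, so keep the distinguishing parts (like the hash) at the front.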

3. Handle Partial Failures

Some requests may fail while others succeed. Always check result types:

for (const result of batchResults) {
  if (result.result.type === 'succeeded') {
    processSuccess(result)
  } else if (result.result.type === 'errored') {
    logError(result.custom_id, result.result.error)
    // Optionally retry failed requests
    failedRequests.push(extractOriginalRequest(result.custom_id))
  } else if (result.result.type === 'expired') {
    // Request timed out within Anthropic's processing
    retryRequests.push(extractOriginalRequest(result.custom_id))
  }
}

4. Set Appropriate Timeouts

Batch processing can take hours. Plan workflows accordingly:

  • Simple requests: 1-4 hours typical
  • Complex requests: 4-12 hours typical
  • Maximum: 24 hours guaranteed

5. Implement Idempotency

Use deterministic custom_ids so reruns don’t create duplicates:

// Idempotent: Same input produces same ID
const customId = `review-${sha}-${filePath}`

// Check if already processed
if (await hasResult(customId)) {
  console.log(`Already processed: ${customId}`)
  return getCachedResult(customId)
}
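The hasResult and getCachedResult helpers above are left undefined; one minimal file-backed way to implement them (the `.batch-state/results` path is an assumption, any durable store works):

```typescript
import { readFile, writeFile, mkdir } from 'fs/promises'

// Minimal file-backed result cache keyed by custom_id. Assumes
// custom_ids are already path-safe (letters, digits, - and _).
const CACHE_DIR = '.batch-state/results'

function cachePath(customId: string): string {
  return `${CACHE_DIR}/${customId}.json`
}

async function hasResult(customId: string): Promise<boolean> {
  return readFile(cachePath(customId), 'utf-8').then(() => true, () => false)
}

async function getCachedResult(customId: string): Promise<unknown> {
  return JSON.parse(await readFile(cachePath(customId), 'utf-8'))
}

async function storeResult(customId: string, result: unknown): Promise<void> {
  await mkdir(CACHE_DIR, { recursive: true })
  await writeFile(cachePath(customId), JSON.stringify(result))
}
```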

Common Pitfalls

Pitfall 1: Polling Too Aggressively

Polling every second wastes resources and may trigger rate limits.

// Bad: Aggressive polling
while (!done) {
  await checkStatus()
  await sleep(1000)  // Every second
}

// Good: Exponential backoff
let waitTime = 5000
while (!done) {
  await checkStatus()
  await sleep(waitTime)
  waitTime = Math.min(waitTime * 1.5, 60000)  // Cap at 1 minute
}

Pitfall 2: Not Storing Batch IDs

If your process crashes, you lose access to pending batches.

// Always persist batch IDs immediately after submission
const batchId = await submitBatch(requests)
await writeFile('.batch-state/pending.json', JSON.stringify({
  batchId,
  submittedAt: new Date().toISOString(),
  requestCount: requests.length
}))

Pitfall 3: Ignoring Cost Savings from Caching

Batch API combined with prompt caching provides even greater savings:

// Structure requests for cache hits
const systemPrompt = await readFile('prompts/code-review.md', 'utf-8')

const requests = files.map(file => ({
  custom_id: `review-${file.path}`,
  params: {
    model: 'claude-sonnet-4-5-20250929',
    max_tokens: 2048,
    messages: [{
      role: 'user',
      content: [
        {
          type: 'text',
          text: systemPrompt,  // Same for all requests - cacheable
          cache_control: { type: 'ephemeral' }
        },
        {
          type: 'text',
          text: `Review: ${file.path}\n\n${file.content}`
        }
      ]
    }]
  }
}))

Pitfall 4: Batch Too Large

While 10,000 requests per batch is allowed, processing time increases:

  • 100 requests: 1-2 hours typical
  • 1,000 requests: 4-8 hours typical
  • 10,000 requests: 12-24 hours typical

Split large workloads into multiple batches for faster partial results.
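Splitting is a one-liner worth standardizing; sub-batches can then be submitted independently (the batch size of 500 in the usage example is an arbitrary starting point, not a recommendation from Anthropic):

```typescript
// Split a large workload into fixed-size sub-batches so results
// arrive incrementally instead of all at once.
function chunkRequests<T>(requests: T[], batchSize: number): T[][] {
  const chunks: T[][] = []
  for (let i = 0; i < requests.length; i += batchSize) {
    chunks.push(requests.slice(i, i + batchSize))
  }
  return chunks
}
```

Usage: `for (const chunk of chunkRequests(allRequests, 500)) await submitBatch(chunk)` yields multiple batch IDs, each of which completes (and can be collected) on its own.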

When NOT to Use Batch API

Batch processing is wrong for:

  • Interactive development: Real-time assistance needs immediate responses
  • Debugging sessions: Quick back-and-forth requires low latency
  • Time-sensitive operations: Deployments, incident response
  • Small workloads: Overhead of batch submission not worth it for <10 requests
  • Dependent requests: When request B needs output from request A

