Summary
The Anthropic Batch API offers a 50% discount on API calls by processing requests asynchronously, with results returned within 24 hours. For non-time-sensitive workloads such as nightly code reviews, test generation, documentation updates, and bulk refactoring, batch processing cuts costs in half while maintaining full model capabilities.
The Problem
Interactive AI assistance requires immediate responses. You ask a question, you get an answer. But many development workflows don’t need real-time interaction:
- Nightly code quality scans
- Bulk test generation for new modules
- Documentation updates across a codebase
- Large-scale refactoring validation
- Security vulnerability analysis
- License compliance checks
Running these tasks interactively wastes money. You pay full price for work that could wait until morning.
The Economics
Consider a nightly code review workflow:
Interactive API (real-time):
- 50 files reviewed
- 5,000 tokens input per file
- 500 tokens output per file
- Model: Claude Sonnet
Input cost: 50 × 5,000 × $0.000003 = $0.75
Output cost: 50 × 500 × $0.000015 = $0.375
Total: $1.125 per run
Monthly (22 workdays): $24.75
Batch API (50% discount):
Input cost: 50 × 5,000 × $0.0000015 = $0.375
Output cost: 50 × 500 × $0.0000075 = $0.1875
Total: $0.5625 per run
Monthly: $12.38
Annual savings: $148.50 per workflow
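These figures can be sanity-checked with a small calculator. The per-token rates below are the Sonnet prices assumed throughout this article; adjust them if pricing changes:

```typescript
// Per-token prices assumed for Claude Sonnet: $3/MTok input, $15/MTok output
const SONNET = { input: 0.000003, output: 0.000015 }
const BATCH_DISCOUNT = 0.5

// Cost of one run: `files` requests, each with the given token counts
function runCost(
  files: number,
  inputTokens: number,
  outputTokens: number,
  batch = false
): number {
  const price = batch ? BATCH_DISCOUNT : 1
  return files * (inputTokens * SONNET.input + outputTokens * SONNET.output) * price
}

const interactive = runCost(50, 5000, 500)              // 1.125 per run
const batched = runCost(50, 5000, 500, true)            // 0.5625 per run
const annualSavings = (interactive - batched) * 22 * 12 // 148.5 per year
```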
For teams running multiple batch workflows (code review, test generation, docs), savings compound to thousands per year.
The Solution
Use the Batch API for any workload that can tolerate up to 24 hours of latency. Structure work as independent requests, submit in batches, and poll for completion.
Core Concepts
Batch vs Interactive:
| Characteristic | Interactive API | Batch API |
|---|---|---|
| Latency | Seconds | Up to 24 hours |
| Cost | Full price | 50% discount |
| Rate limits | Standard | Higher throughput |
| Use case | Interactive dev | Scheduled jobs |
| Minimum requests | 1 | 1 |
| Maximum requests | 1 | 10,000 per batch |
How Batch Processing Works:
1. Create a batch with multiple requests
2. Submit the batch to Anthropic
3. Poll for completion (or receive a webhook)
4. Retrieve results when ready
The API processes requests in parallel on Anthropic’s infrastructure. You don’t manage concurrency. You just submit work and wait.
Implementation: Basic Batch Submission
import Anthropic from '@anthropic-ai/sdk'
const client = new Anthropic()
interface BatchRequest {
custom_id: string
params: Anthropic.MessageCreateParams
}
async function submitBatch(requests: BatchRequest[]): Promise<string> {
// Create the batch
const batch = await client.beta.messages.batches.create({
requests: requests.map(req => ({
custom_id: req.custom_id,
params: req.params
}))
})
console.log(`Batch submitted: ${batch.id}`)
console.log(`Requests: ${requests.length}`)
console.log(`Status: ${batch.processing_status}`)
return batch.id
}
// Example: Submit code reviews for multiple files
async function submitCodeReviews(files: Array<{ path: string; content: string }>) {
const requests: BatchRequest[] = files.map((file, index) => ({
custom_id: `review-${index}-${file.path}`,
params: {
model: 'claude-sonnet-4-5-20250929',
max_tokens: 2048,
messages: [{
role: 'user',
content: `Review this code for bugs, security issues, and style problems.
File: ${file.path}
\`\`\`
${file.content}
\`\`\`
Provide specific, actionable feedback.`
}]
}
}))
return submitBatch(requests)
}
Implementation: Polling for Results
interface BatchResult {
custom_id: string
result: {
type: 'succeeded' | 'errored' | 'canceled' | 'expired'
message?: Anthropic.Message
error?: { type: string; message: string }
}
}
async function pollBatchResults(
batchId: string,
maxWaitMinutes: number = 60
): Promise<BatchResult[]> {
const startTime = Date.now()
const maxWaitMs = maxWaitMinutes * 60 * 1000
while (Date.now() - startTime < maxWaitMs) {
const batch = await client.beta.messages.batches.retrieve(batchId)
console.log(`Status: ${batch.processing_status}`)
console.log(`Progress: ${batch.request_counts.succeeded} succeeded, ${batch.request_counts.errored} errored, ${batch.request_counts.processing} still processing`)
if (batch.processing_status === 'ended') {
// Fetch all results
const results: BatchResult[] = []
for await (const result of client.beta.messages.batches.results(batchId)) {
results.push(result as BatchResult)
}
return results
}
// Wait before polling again (exponential backoff)
const waitTime = Math.min(30000, 5000 * Math.pow(1.5, Math.floor((Date.now() - startTime) / 60000)))
await new Promise(resolve => setTimeout(resolve, waitTime))
}
throw new Error(`Batch ${batchId} did not complete within ${maxWaitMinutes} minutes`)
}
// Process results
function processCodeReviewResults(results: BatchResult[]) {
const reviews: Array<{ file: string; feedback: string; status: string }> = []
for (const result of results) {
const filePath = result.custom_id.replace(/^review-\d+-/, '')
if (result.result.type === 'succeeded' && result.result.message) {
const content = result.result.message.content[0]
reviews.push({
file: filePath,
feedback: content.type === 'text' ? content.text : '',
status: 'success'
})
} else {
reviews.push({
file: filePath,
feedback: result.result.error?.message || 'Unknown error',
status: 'error'
})
}
}
return reviews
}
Use Case Patterns
Pattern 1: Nightly Code Review
Run comprehensive code analysis overnight when nobody needs interactive feedback.
// scripts/nightly-review.ts
import { glob } from 'glob'
import { readFile, writeFile } from 'fs/promises'
async function runNightlyReview() {
// Gather files changed in the last 24 hours
// (getRecentlyChangedFiles is a project-specific helper, e.g. wrapping git log)
const changedFiles = await getRecentlyChangedFiles(24)
console.log(`Found ${changedFiles.length} files to review`)
// Filter to code files only
const codeFiles = changedFiles.filter(f =>
/\.(ts|tsx|js|jsx|py|go|rs)$/.test(f)
)
// Read file contents
const filesWithContent = await Promise.all(
codeFiles.map(async path => ({
path,
content: await readFile(path, 'utf-8')
}))
)
// Submit batch
const batchId = await submitCodeReviews(filesWithContent)
// Store batch ID for morning retrieval
await writeFile('.batch-state/current-review.json', JSON.stringify({
batchId,
submittedAt: new Date().toISOString(),
fileCount: filesWithContent.length
}))
console.log(`Batch ${batchId} submitted for ${filesWithContent.length} files`)
console.log('Results will be ready by morning')
}
// Run at 2 AM via cron/scheduler
runNightlyReview()
# .github/workflows/nightly-review.yml
name: Nightly Code Review

on:
  schedule:
    - cron: '0 2 * * *' # 2 AM daily

jobs:
  submit-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Need history for changed files
      - name: Submit batch review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: npx ts-node scripts/nightly-review.ts
      - name: Save batch state
        uses: actions/upload-artifact@v4
        with:
          name: batch-state
          path: .batch-state/

  collect-results:
    runs-on: ubuntu-latest
    needs: submit-review
    # NOTE: with `needs`, this job starts as soon as submission finishes,
    # so the collect script must poll until the batch completes (or move
    # this job into a separate workflow scheduled a few hours later)
    if: ${{ always() }}
    steps:
      - uses: actions/checkout@v4
      - name: Download batch state
        uses: actions/download-artifact@v4
        with:
          name: batch-state
          path: .batch-state/
      - name: Collect and report results
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: npx ts-node scripts/collect-review-results.ts
      - name: Create issue with findings
        uses: actions/github-script@v7
        with:
          script: |
            const results = require('./.batch-state/review-results.json')
            // Create GitHub issue with review findings
Pattern 2: Bulk Test Generation
Generate tests for an entire module at once, not file by file.
interface TestGenRequest {
sourcePath: string
sourceContent: string
existingTests?: string
framework: 'jest' | 'vitest' | 'pytest'
}
async function generateTestsBatch(requests: TestGenRequest[]): Promise<string> {
const batchRequests = requests.map((req, i) => ({
custom_id: `test-gen-${i}-${req.sourcePath}`,
params: {
model: 'claude-sonnet-4-5-20250929',
max_tokens: 4096,
messages: [{
role: 'user',
content: `Generate comprehensive tests for this code using ${req.framework}.
Source file: ${req.sourcePath}
\`\`\`
${req.sourceContent}
\`\`\`
${req.existingTests ? `Existing tests (don't duplicate):\n\`\`\`\n${req.existingTests}\n\`\`\`` : ''}
Requirements:
- Test all public functions/methods
- Include edge cases and error conditions
- Use descriptive test names
- Mock external dependencies
- Follow ${req.framework} best practices
Output only the test code, no explanations.`
}]
}
}))
return submitBatch(batchRequests)
}
// Usage: Generate tests for all files in a directory
async function generateModuleTests(moduleDir: string) {
const sourceFiles = await glob(`${moduleDir}/**/*.ts`, {
ignore: ['**/*.test.ts', '**/*.spec.ts']
})
const requests = await Promise.all(
sourceFiles.map(async path => {
const testPath = path.replace(/\.ts$/, '.test.ts')
const existingTests = await readFile(testPath, 'utf-8').catch(() => undefined)
return {
sourcePath: path,
sourceContent: await readFile(path, 'utf-8'),
existingTests,
framework: 'vitest' as const
}
})
)
const batchId = await generateTestsBatch(requests)
console.log(`Test generation batch: ${batchId}`)
return batchId
}
Pattern 3: Documentation Updates
Bulk-generate or update documentation across a codebase.
async function updateDocsBatch(
files: Array<{ path: string; content: string; existingDocs?: string }>
): Promise<string> {
const requests = files.map((file, i) => ({
custom_id: `docs-${i}-${file.path}`,
params: {
model: 'claude-sonnet-4-5-20250929',
max_tokens: 3000,
messages: [{
role: 'user',
content: `Generate or update JSDoc/TSDoc documentation for this code.
File: ${file.path}
\`\`\`typescript
${file.content}
\`\`\`
${file.existingDocs ? `Current documentation:\n${file.existingDocs}\n\nUpdate if outdated.` : 'No existing documentation.'}
Requirements:
- Document all exported functions, classes, and types
- Include @param, @returns, @throws, @example where appropriate
- Keep descriptions concise but complete
- Match existing documentation style if present
Output only the documented code, no explanations.`
}]
}
}))
return submitBatch(requests)
}
Pattern 4: Security Vulnerability Analysis
Batch-scan code for security issues without blocking development.
const SECURITY_PROMPT = `Analyze this code for security vulnerabilities.
Focus on:
1. Injection vulnerabilities (SQL, command, XSS)
2. Authentication and authorization flaws
3. Sensitive data exposure
4. Insecure dependencies usage
5. Cryptographic issues
6. Race conditions
7. Resource leaks
For each issue found, provide:
- Severity (Critical/High/Medium/Low)
- Location (line number or function)
- Description of the vulnerability
- Remediation guidance
If no issues found, state "No security issues identified."`
async function runSecurityScan(files: Array<{ path: string; content: string }>) {
const requests = files.map((file, i) => ({
custom_id: `security-${i}-${file.path}`,
params: {
model: 'claude-sonnet-4-5-20250929',
max_tokens: 2048,
messages: [{
role: 'user',
content: `${SECURITY_PROMPT}
File: ${file.path}
\`\`\`
${file.content}
\`\`\``
}]
}
}))
return submitBatch(requests)
}
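Because the prompt requests a fixed "Severity (Critical/High/Medium/Low)" format, the responses can be post-processed mechanically. The parser below is a sketch that assumes the model follows the requested format; real outputs will need more defensive handling:

```typescript
interface SecurityFinding {
  severity: 'Critical' | 'High' | 'Medium' | 'Low'
  detail: string
}

// Pull severity lines out of a review; the rest of each matching line
// is captured as the finding's detail text
function parseSecurityFindings(review: string): SecurityFinding[] {
  const findings: SecurityFinding[] = []
  const pattern = /Severity[:\s(]*?(Critical|High|Medium|Low)\)?[:\s-]*([^\n]*)/g
  let match: RegExpExecArray | null
  while ((match = pattern.exec(review)) !== null) {
    findings.push({
      severity: match[1] as SecurityFinding['severity'],
      detail: match[2].trim()
    })
  }
  return findings
}

// Sort findings so Critical issues surface first in reports
const SEVERITY_ORDER = { Critical: 0, High: 1, Medium: 2, Low: 3 }
function sortBySeverity(findings: SecurityFinding[]): SecurityFinding[] {
  return [...findings].sort(
    (a, b) => SEVERITY_ORDER[a.severity] - SEVERITY_ORDER[b.severity]
  )
}
```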
Pattern 5: Large-Scale Refactoring Validation
Validate proposed refactors across many files before applying.
interface RefactorValidation {
originalPath: string
originalContent: string
proposedContent: string
refactorDescription: string
}
async function validateRefactorsBatch(validations: RefactorValidation[]) {
const requests = validations.map((v, i) => ({
custom_id: `refactor-${i}-${v.originalPath}`,
params: {
model: 'claude-sonnet-4-5-20250929',
max_tokens: 2048,
messages: [{
role: 'user',
content: `Validate this refactoring for correctness and potential issues.
Refactoring: ${v.refactorDescription}
Original code (${v.originalPath}):
\`\`\`
${v.originalContent}
\`\`\`
Proposed code:
\`\`\`
${v.proposedContent}
\`\`\`
Check for:
1. Behavior changes (intended vs unintended)
2. Type compatibility
3. Missing imports or exports
4. Broken references
5. Edge case handling changes
Output:
- VALID if refactoring is correct
- INVALID with specific issues if problems found`
}]
}
}))
return submitBatch(requests)
}
Workflow Integration
GitHub Actions: Submit and Collect Pattern
# Two-job pattern: submit batch, then collect results later
name: Batch Processing Workflow

on:
  schedule:
    - cron: '0 22 * * *' # 10 PM: submit batch
  workflow_dispatch:
    inputs:
      action:
        type: choice
        options:
          - submit
          - collect

jobs:
  submit-batch:
    if: github.event_name == 'schedule' || github.event.inputs.action == 'submit'
    runs-on: ubuntu-latest
    outputs:
      batch_id: ${{ steps.submit.outputs.batch_id }}
    steps:
      - uses: actions/checkout@v4
      - name: Submit batch
        id: submit
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          BATCH_ID=$(node scripts/submit-batch.js)
          echo "batch_id=$BATCH_ID" >> $GITHUB_OUTPUT
      - name: Store batch ID
        run: |
          echo "${{ steps.submit.outputs.batch_id }}" > .batch-id
      - uses: actions/upload-artifact@v4
        with:
          name: batch-id
          path: .batch-id
          retention-days: 1

  collect-results:
    if: github.event.inputs.action == 'collect'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # download-artifact@v4 only sees artifacts from the same workflow run
      # by default; to fetch the artifact uploaded by the scheduled run, pass
      # `run-id` and `github-token` (or persist the batch ID somewhere durable)
      - uses: actions/download-artifact@v4
        with:
          name: batch-id
      - name: Collect results
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          BATCH_ID=$(cat .batch-id)
          node scripts/collect-results.js "$BATCH_ID"
Webhook-Based Collection (Recommended)
Instead of polling, receive a notification when a batch completes. Treat the handler below as a sketch: the endpoint path, the `anthropic-signature` header name, the HMAC scheme, and the `batch.completed` event shape are assumptions here; verify them against Anthropic's current webhook documentation.
// server/webhook-handler.ts
import express from 'express'
import crypto from 'crypto'
import { writeFile } from 'fs/promises'
const app = express()
// Verify webhook signature
function verifyWebhookSignature(payload: string, signature: string): boolean {
const secret = process.env.ANTHROPIC_WEBHOOK_SECRET!
const expected = crypto
.createHmac('sha256', secret)
.update(payload)
.digest('hex')
// timingSafeEqual throws if buffer lengths differ, so check length first
if (signature.length !== expected.length) return false
return crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected))
}
app.post('/webhooks/anthropic', express.raw({ type: 'application/json' }), async (req, res) => {
const signature = req.headers['anthropic-signature'] as string
const payload = req.body.toString()
if (!verifyWebhookSignature(payload, signature)) {
return res.status(401).send('Invalid signature')
}
const event = JSON.parse(payload)
if (event.type === 'batch.completed') {
const batchId = event.data.batch_id
console.log(`Batch ${batchId} completed`)
// Trigger result processing
await processBatchResults(batchId)
}
res.status(200).send('OK')
})
async function processBatchResults(batchId: string) {
const results = await pollBatchResults(batchId, 1) // Should be immediate
// Store results
await writeFile(
`results/${batchId}.json`,
JSON.stringify(results, null, 2)
)
// Notify team (sendSlackMessage is a project-specific helper)
await sendSlackMessage(`Batch ${batchId} complete: ${results.length} results`)
}
Cost Comparison Table
| Workflow | Interactive Cost | Batch Cost | Monthly Savings |
|---|---|---|---|
| 50 file code review | $1.125 | $0.56 | $12.43 |
| 100 test generations | $4.50 | $2.25 | $49.50 |
| 200 doc updates | $3.00 | $1.50 | $33.00 |
| Security scan (100 files) | $2.25 | $1.13 | $24.64 |
| Total (daily) | $10.88 | $5.44 | $119.57 |
Assumptions: Sonnet pricing ($3/MTok input, $15/MTok output), 50% batch discount, typical per-request token counts for each workflow, 22 workdays/month.
Best Practices
1. Batch Similar Work Together
Group requests by type for easier result processing:
// Good: Homogeneous batches
const reviewBatch = await submitBatch(codeReviewRequests)
const testBatch = await submitBatch(testGenRequests)
// Avoid: Mixed batches that complicate result handling
const mixedBatch = await submitBatch([...reviews, ...tests, ...docs])
2. Use Meaningful Custom IDs
Include enough context to process results without the original request:
// Good: Rich context in ID
custom_id: `review-${sha.slice(0, 8)}-src/api/users.ts`
// Bad: Opaque ID
custom_id: `req-${Math.random()}`
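A pair of helpers keeps ID construction and parsing in one place. The `review` prefix and 8-character SHA convention below mirror the example above; adapt the format to your own workloads:

```typescript
// Build custom IDs of the form "<kind>-<sha8>-<path>". The path may itself
// contain dashes, so parsing anchors on the fixed-width SHA segment.
function buildCustomId(kind: string, sha: string, filePath: string): string {
  return `${kind}-${sha.slice(0, 8)}-${filePath}`
}

function parseCustomId(id: string): { kind: string; sha: string; filePath: string } {
  const match = /^([a-z]+)-([0-9a-f]{8})-(.+)$/.exec(id)
  if (!match) throw new Error(`Unrecognized custom_id: ${id}`)
  return { kind: match[1], sha: match[2], filePath: match[3] }
}
```

With IDs like these, result processing can recover the file path and commit without consulting the original request list.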
3. Handle Partial Failures
Some requests may fail while others succeed. Always check result types:
for (const result of batchResults) {
if (result.result.type === 'succeeded') {
processSuccess(result)
} else if (result.result.type === 'errored') {
logError(result.custom_id, result.result.error)
// Optionally retry failed requests
failedRequests.push(extractOriginalRequest(result.custom_id))
} else if (result.result.type === 'expired') {
// Request timed out within Anthropic's processing
retryRequests.push(extractOriginalRequest(result.custom_id))
}
}
4. Set Appropriate Timeouts
Batch processing can take hours. Plan workflows accordingly:
- Simple requests: 1-4 hours typical
- Complex requests: 4-12 hours typical
- Maximum: 24 hours (requests still unfinished after 24 hours are marked expired)
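One way to turn those rules of thumb into a polling budget. The tiers below restate the rough figures above; they are heuristics, not API guarantees:

```typescript
type Complexity = 'simple' | 'complex'

// Typical completion windows from above, in minutes (heuristics only)
const TYPICAL_WINDOW: Record<Complexity, number> = {
  simple: 4 * 60,   // 1-4 hours typical
  complex: 12 * 60  // 4-12 hours typical
}

// Poll budget: the typical upper bound plus 50% headroom, capped at the
// 24-hour window after which unfinished requests expire anyway
function pollBudgetMinutes(kind: Complexity): number {
  return Math.min(TYPICAL_WINDOW[kind] * 1.5, 24 * 60)
}
```

The result can be passed straight to `pollBatchResults(batchId, pollBudgetMinutes('simple'))` from the polling implementation earlier.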
5. Implement Idempotency
Use deterministic custom_ids so reruns don’t create duplicates:
// Idempotent: Same input produces same ID
const customId = `review-${sha}-${filePath}`
// Check if already processed
if (await hasResult(customId)) {
console.log(`Already processed: ${customId}`)
return getCachedResult(customId)
}
Common Pitfalls
Pitfall 1: Polling Too Aggressively
Polling every second wastes resources and may trigger rate limits.
// Bad: Aggressive polling
while (!done) {
await checkStatus()
await sleep(1000) // Every second
}
// Good: Exponential backoff
let waitTime = 5000
while (!done) {
await checkStatus()
await sleep(waitTime)
waitTime = Math.min(waitTime * 1.5, 60000) // Cap at 1 minute
}
Pitfall 2: Not Storing Batch IDs
If your process crashes, you lose access to pending batches.
// Always persist batch IDs immediately after submission
const batchId = await submitBatch(requests)
await writeFile('.batch-state/pending.json', JSON.stringify({
batchId,
submittedAt: new Date().toISOString(),
requestCount: requests.length
}))
Pitfall 3: Ignoring Cost Savings from Caching
Batch API combined with prompt caching provides even greater savings:
// Structure requests for cache hits
const systemPrompt = await readFile('prompts/code-review.md', 'utf-8')
const requests = files.map(file => ({
custom_id: `review-${file.path}`,
params: {
model: 'claude-sonnet-4-5-20250929',
max_tokens: 2048,
messages: [{
role: 'user',
content: [
{
type: 'text',
text: systemPrompt, // Same for all requests - cacheable
cache_control: { type: 'ephemeral' }
},
{
type: 'text',
text: `Review: ${file.path}\n\n${file.content}`
}
]
}]
}
}))
Pitfall 4: Batch Too Large
While 10,000 requests per batch is allowed, processing time increases:
- 100 requests: 1-2 hours typical
- 1,000 requests: 4-8 hours typical
- 10,000 requests: 12-24 hours typical
Split large workloads into multiple batches for faster partial results.
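Splitting is a small amount of code. This sketch chunks a request list into fixed-size batches; each chunk can then go through `submitBatch()` from earlier, so results arrive incrementally as the smaller batches finish:

```typescript
// Split a large request list into batches of at most `size` requests.
// Submitting each chunk separately yields partial results sooner than
// one giant batch that must finish as a unit.
function chunkRequests<T>(requests: T[], size = 1000): T[][] {
  const chunks: T[][] = []
  for (let i = 0; i < requests.length; i += size) {
    chunks.push(requests.slice(i, i + size))
  }
  return chunks
}
```

Usage might look like `await Promise.all(chunkRequests(allRequests, 1000).map(submitBatch))`, trading a little batch-ID bookkeeping for faster feedback.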
When NOT to Use Batch API
Batch processing is wrong for:
- Interactive development: Real-time assistance needs immediate responses
- Debugging sessions: Quick back-and-forth requires low latency
- Time-sensitive operations: Deployments, incident response
- Small workloads: Overhead of batch submission not worth it for <10 requests
- Dependent requests: When request B needs output from request A
Related
- Cost Protection with Multi-Layer Timeouts – Budget controls for all API usage
- Model Switching Strategy – Choose cheaper models for simple tasks
- Prompt Caching Strategy – Reduce costs on repeated context
- CI/CD Agent Patterns – GitHub Actions integration patterns
References
- Anthropic Batch API Documentation
- Anthropic API Pricing (50% batch discount)
- GitHub Actions Scheduled Workflows

