## Summary

Integrate LLM code review as a GitHub Action on every pull request to provide consistent, fast, and educational feedback. It catches common issues automatically, scales without bottlenecking senior developers, and costs roughly $0.10-0.50 per PR. It is particularly valuable for first-time contributors and external PRs, where immediate feedback accelerates onboarding.

**The Problem:** Manual code reviews are slow, inconsistent, and miss common patterns. Senior developers become bottlenecks reviewing routine PRs, while juniors wait days for feedback on basic issues like missing type safety, improper error handling, or style violations. External contributors may submit PRs that don't follow project standards, wasting reviewer time on trivial issues.

**The Solution:** Run Claude Code as a GitHub Action on every PR, providing automated review comments that focus on code quality, security, performance, and best practices. The LLM has access to project context (CLAUDE.md files, schemas, standards) and can read tests, suggest improvements, and even run verification tools. Reviews complete in parallel with CI checks, providing immediate feedback while human reviewers focus on architecture and business logic.
## The Problem

Code reviews are essential for quality, but they're also a major bottleneck in modern development workflows.

### The Manual Review Bottleneck

**Scenario:** Your team receives a PR from a first-time contributor:
```typescript
// PR #247: Add user profile endpoint
export async function getUserProfile(req, res) {
  const user = await db.users.findById(req.params.id);
  res.json(user);
}
```
A senior developer reviews this 6 hours later and finds:
- ❌ No type safety (no TypeScript types)
- ❌ No input validation (SQL injection risk)
- ❌ No error handling (crashes if user not found)
- ❌ Direct database access (violates repository pattern)
- ❌ No authentication check (security issue)
- ❌ No tests
**Result:** The contributor waits 6 hours for feedback, then revises and waits another 6 hours for re-review. Total time to merge: 2-3 days.
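For contrast, here is a sketch of what the reviewers are asking for, assuming Express with zod validation and a repository layer (the `userRepository` import and the upstream auth middleware are hypothetical):

```typescript
import { Request, Response } from 'express';
import { z } from 'zod';
import { userRepository } from '../repositories/userRepository'; // hypothetical repository layer

const paramsSchema = z.object({ id: z.string().uuid() });

// Authentication is assumed to be enforced by upstream middleware on this route
export async function getUserProfile(req: Request, res: Response): Promise<void> {
  // Validate input before touching the database
  const parsed = paramsSchema.safeParse(req.params);
  if (!parsed.success) {
    res.status(400).json({ error: 'Invalid user id' });
    return;
  }

  try {
    const user = await userRepository.findById(parsed.data.id);
    if (!user) {
      res.status(404).json({ error: 'User not found' });
      return;
    }
    res.json(user);
  } catch (error) {
    res.status(500).json({ error: 'Failed to load user profile' });
  }
}
```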
### The Cost of Manual Reviews
**For a team of 10 developers:**
- Average PRs per day: 15
- Average review time: 15-30 minutes per PR
- Senior developer time spent reviewing: 4-7 hours/day
- Opportunity cost: ~50% of senior developer capacity

**For external contributors:**
- First PR submission often violates basic standards
- Manual review identifies 5-10 issues
- The contributor fixes them and submits a new revision
- The cycle repeats 2-3 times before merge
- Total time: 3-7 days from first submission to merge
### Common Issues Missed or Delayed

**Security issues:**
- Missing input validation
- SQL injection vulnerabilities
- XSS vulnerabilities
- Authentication/authorization bypasses
**Code quality issues:**
- Missing type annotations
- Inconsistent error handling
- Poor naming conventions
- Code duplication
**Performance issues:**
- N+1 database queries
- Missing indexes
- Inefficient algorithms
- Memory leaks
**Architecture violations:**
- Bypassing abstraction layers
- Direct database access from controllers
- Mixing business logic with presentation
- Tight coupling
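Some of these are easiest to recognize by example. Here is a minimal sketch of the N+1 query pattern from the performance list; the `postRepo` and `userRepo` helpers are hypothetical:

```typescript
// Hypothetical repository helpers for illustration
interface User { id: string; name: string }
interface Post { id: string; authorId: string; author?: User }
declare const postRepo: { findRecent(n: number): Promise<Post[]> };
declare const userRepo: {
  findById(id: string): Promise<User>;
  findByIds(ids: string[]): Promise<User[]>;
};

async function loadFeed(): Promise<void> {
  // ❌ N+1: one query for the posts, then one query per post for its author
  const posts = await postRepo.findRecent(50);
  for (const post of posts) {
    post.author = await userRepo.findById(post.authorId); // 50 extra round trips
  }

  // ✅ Batched: collect the ids and fetch all authors in a single query
  const authors = await userRepo.findByIds(posts.map((p) => p.authorId));
  const byId = new Map(authors.map((a) => [a.id, a]));
  for (const post of posts) {
    post.author = byId.get(post.authorId);
  }
}
```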
## The Solution

Integrate Claude Code as a GitHub Action that automatically reviews every pull request, providing immediate, consistent feedback.

### How It Works
```yaml
# .github/workflows/claude-code-review.yml
name: Claude Code Review

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  claude-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@beta
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          direct_prompt: |
            Review this PR focusing on:
            - Code quality and best practices
            - Potential bugs or security issues
            - Performance considerations
            - Test coverage
            Be constructive and helpful.
```
### What the LLM Reviews

**1. Type Safety:**

```typescript
// ❌ LLM flags this
export async function getUserProfile(req, res) {
  const user = await db.users.findById(req.params.id);
  res.json(user);
}

// Comment from Claude:
// "Missing type annotations. Consider:
// - Add types for req, res parameters
// - Specify return type
// - Use Request/Response types from express"
```

**2. Security Issues:**

```typescript
// ❌ LLM flags this
const query = `SELECT * FROM users WHERE id = ${req.params.id}`;

// Comment from Claude:
// "SQL injection vulnerability detected.
// Use parameterized queries instead:
// db.query('SELECT * FROM users WHERE id = ?', [req.params.id])"
```

**3. Error Handling:**

```typescript
// ❌ LLM flags this
const user = await db.users.findById(id);
return user.email; // Crashes if user is null

// Comment from Claude:
// "Missing null check. If the user is not found, this will throw.
// Consider:
// if (!user) {
//   return { success: false, error: 'User not found' };
// }"
```

**4. Architecture Violations:**

```typescript
// ❌ LLM flags this (reading CLAUDE.md patterns)
export class UserService {
  async getUser(id: string) {
    return await supabase.from('users').select().eq('id', id);
  }
}

// Comment from Claude:
// "Per CLAUDE.md, services should not access Supabase directly.
// Use UserRepository instead:
// constructor(private userRepo: UserRepository) {}
// return this.userRepo.findById(id);"
```

**5. Missing Tests:**

```typescript
// ❌ LLM flags this
// New file: src/services/payment.ts
// No corresponding test file

// Comment from Claude:
// "No tests found for this service. Consider adding:
// - src/services/payment.test.ts
// - Test happy path (successful payment)
// - Test error cases (insufficient funds, network errors)"
```
## Implementation

### Step 1: Set Up the GitHub Action

Create `.github/workflows/claude-code-review.yml`:
```yaml
name: Claude Code Review

on:
  pull_request:
    types: [opened, synchronize]
    # Only review certain file types
    paths:
      - '**.ts'
      - '**.tsx'
      - '**.js'
      - '**.jsx'
      - '**.py'

jobs:
  claude-review:
    runs-on: ubuntu-latest
    # Skip for automated PRs (Dependabot, etc.)
    if: |
      github.event.pull_request.user.login != 'dependabot[bot]' &&
      github.event.pull_request.user.login != 'renovate[bot]'
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0 # Full history for better context

      - name: Claude Code Review
        uses: anthropics/claude-code-action@beta
        with:
          claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
          use_sticky_comment: true # Reuse same comment on updates
          direct_prompt: |
            Review this pull request thoroughly.

            Focus areas:
            1. **Type Safety**: Check TypeScript types, any usage
            2. **Security**: Input validation, SQL injection, XSS, auth
            3. **Error Handling**: Proper try/catch, error messages
            4. **Performance**: N+1 queries, inefficient algorithms
            5. **Architecture**: Layer boundaries, separation of concerns
            6. **Testing**: Test coverage, edge cases

            Context:
            - Read all CLAUDE.md files for project patterns
            - Check if changes follow established conventions
            - Verify tests exist for new/modified code

            Be constructive and specific. Provide code examples for suggestions.
```
### Step 2: Add the Claude Code OAuth Token

1. Go to https://claude.com/claude-code/tokens
2. Generate an OAuth token
3. Add it to your GitHub repository secrets:
   - Settings → Secrets and variables → Actions
   - New repository secret: `CLAUDE_CODE_OAUTH_TOKEN`
### Step 3: Configure Review Prompts

Customize the review prompt based on your needs.

#### For First-Time Contributors
```yaml
direct_prompt: |
  ${{ github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR' &&
  'Welcome! This is a first-time contribution. Review with encouragement:
  - Point out issues gently with explanations
  - Suggest improvements with code examples
  - Link to relevant documentation
  - Praise good patterns you see' ||
  'Provide thorough review focusing on coding standards and best practices.' }}
```
#### For Different File Types

```yaml
# API endpoints
# NOTE: github.event.pull_request.changed_files is a file *count* in the
# webhook payload; matching changed paths like 'api/' requires a separate
# step (e.g. git diff or a paths-filter action).
- name: Review API Changes
  if: contains(github.event.pull_request.changed_files, 'api/')
  uses: anthropics/claude-code-action@beta
  with:
    direct_prompt: |
      Review API endpoint changes:
      - Input validation for all parameters
      - Proper HTTP status codes
      - Authentication/authorization checks
      - Rate limiting considerations
      - API documentation updates

# React components
- name: Review UI Changes
  if: contains(github.event.pull_request.changed_files, 'components/')
  uses: anthropics/claude-code-action@beta
  with:
    direct_prompt: |
      Review React component changes:
      - Accessibility (WCAG compliance)
      - Performance (avoid unnecessary re-renders)
      - Prop types/TypeScript interfaces
      - Responsive design
      - User experience patterns
```
### Step 4: Allow Tool Usage (Optional)

Let Claude run verification tools:

```yaml
- uses: anthropics/claude-code-action@beta
  with:
    claude_code_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
    allowed_tools: |
      Bash(npm run test)
      Bash(npm run lint)
      Bash(npm run typecheck)
    direct_prompt: |
      Review this PR and run verification:
      1. Run tests: npm run test
      2. Run linter: npm run lint
      3. Run type checker: npm run typecheck
      Report any failures and suggest fixes.
```
### Step 5: Use Sticky Comments

Avoid comment spam on PR updates:

```yaml
with:
  use_sticky_comment: true # Edits same comment instead of creating new ones
```

**Before (without sticky comments):**

```text
[Bot] Review comment (Commit 1abc)
[Bot] Review comment (Commit 2def)
[Bot] Review comment (Commit 3ghi)
# Result: 3 comments, cluttered PR
```

**After (with sticky comments):**

```text
[Bot] Review comment (Updated for latest commit 3ghi)
# Result: 1 comment, updated in place
```
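Under the hood, a sticky comment amounts to "find my previous comment and edit it, otherwise create one". A minimal sketch of that mechanic with Octokit; the HTML-comment marker is an assumption, and the action's actual implementation may differ:

```typescript
import { Octokit } from '@octokit/rest';

// Hidden marker so the bot can find its own comment on later runs (assumption)
const MARKER = '<!-- claude-code-review -->';

async function upsertReviewComment(
  octokit: Octokit,
  owner: string,
  repo: string,
  prNumber: number,
  body: string
): Promise<void> {
  const { data: comments } = await octokit.rest.issues.listComments({
    owner,
    repo,
    issue_number: prNumber,
  });
  const existing = comments.find((c) => c.body?.includes(MARKER));
  if (existing) {
    // Edit the previous review comment in place
    await octokit.rest.issues.updateComment({
      owner,
      repo,
      comment_id: existing.id,
      body: `${MARKER}\n${body}`,
    });
  } else {
    // First run on this PR: create the comment
    await octokit.rest.issues.createComment({
      owner,
      repo,
      issue_number: prNumber,
      body: `${MARKER}\n${body}`,
    });
  }
}
```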
## Advanced Patterns

### Pattern 1: Conditional Reviews

Only review for certain author types:

```yaml
jobs:
  claude-review:
    if: |
      github.event.pull_request.author_association == 'FIRST_TIME_CONTRIBUTOR' ||
      github.event.pull_request.author_association == 'CONTRIBUTOR'
    # Skip for maintainers/members (they know the standards)
```
### Pattern 2: Different Review Depth

```yaml
# Light review for small changes
- name: Quick Review
  if: github.event.pull_request.changed_files < 5
  uses: anthropics/claude-code-action@beta
  with:
    direct_prompt: "Quick review focusing on obvious issues only."

# Deep review for large changes
- name: Thorough Review
  if: github.event.pull_request.changed_files >= 5
  uses: anthropics/claude-code-action@beta
  with:
    direct_prompt: "Comprehensive review including architecture, security, performance."
```
### Pattern 3: Domain-Specific Reviews

```yaml
# Security review for auth changes
# (see the path-matching caveat under "For Different File Types" above)
- name: Security Review
  if: contains(github.event.pull_request.changed_files, 'auth/')
  uses: anthropics/claude-code-action@beta
  with:
    direct_prompt: |
      Security-focused review:
      - Authentication bypass vulnerabilities
      - Session management issues
      - Token validation
      - Rate limiting
      - Audit logging

# Performance review for database changes
- name: Performance Review
  if: contains(github.event.pull_request.changed_files, 'database/')
  uses: anthropics/claude-code-action@beta
  with:
    direct_prompt: |
      Performance-focused review:
      - Missing indexes
      - N+1 query patterns
      - Inefficient joins
      - Large data fetching
      - Connection pooling
```
### Pattern 4: Context-Aware Reviews

```yaml
direct_prompt: |
  Review this PR using project context:

  1. Read CLAUDE.md files to understand:
     - Architecture patterns
     - Coding standards
     - Error handling conventions
     - Testing requirements

  2. Check if changes follow:
     - Existing naming conventions (grep for similar patterns)
     - Established patterns (find similar implementations)
     - Project structure (ensure files in correct locations)

  3. Verify integration:
     - Tests exist and pass
     - Types are correct
     - Documentation is updated
```
## Cost Analysis

### Per-PR Cost

**Small PR (~100 lines changed):**
- Input tokens: ~5K (PR diff + context)
- Output tokens: ~500 (review comments)
- Cost: ~$0.10

**Medium PR (~500 lines changed):**
- Input tokens: ~15K
- Output tokens: ~1K
- Cost: ~$0.30

**Large PR (1000+ lines changed):**
- Input tokens: ~30K
- Output tokens: ~2K
- Cost: ~$0.50
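These figures are easy to sanity-check with a small estimator. The per-token rates below are placeholders chosen to roughly reproduce the numbers above, not published pricing; substitute the current rates for your model:

```typescript
// Placeholder per-1K-token rates (assumption, not published pricing)
const INPUT_COST_PER_1K = 0.014;
const OUTPUT_COST_PER_1K = 0.06;

function estimateReviewCost(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1000) * INPUT_COST_PER_1K +
    (outputTokens / 1000) * OUTPUT_COST_PER_1K
  );
}

// Per-PR estimates from the table above:
console.log(estimateReviewCost(5_000, 500).toFixed(2));   // small PR:  ~$0.10
console.log(estimateReviewCost(15_000, 1_000).toFixed(2)); // medium PR: ~$0.27
console.log(estimateReviewCost(30_000, 2_000).toFixed(2)); // large PR:  ~$0.54
```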
### Monthly Cost Projections

- Small team (5 developers, 50 PRs/month): 50 PRs × $0.20 average = $10/month
- Medium team (20 developers, 200 PRs/month): 200 PRs × $0.20 average = $40/month
- Large team (100 developers, 1000 PRs/month): 1000 PRs × $0.20 average = $200/month
### ROI Analysis

**Time saved per PR:**
- Manual review time: 15-30 minutes
- LLM review time: 2-3 minutes (automated)
- Time saved: 12-27 minutes per PR

**For a team of 20 developers (200 PRs/month):**
- Time saved: 200 PRs × 20 min = 4,000 min/month ≈ 66 hours
- Cost saved (at $100/hour): ~$6,600/month
- LLM cost: $40/month
- ROI: ($6,600 − $40) / $40 ≈ 164x, a 16,400% return

**Value beyond time savings:**
- Faster feedback for contributors (hours → minutes)
- Consistent review quality (no tired or rushed reviews)
- Educational feedback for juniors (detailed explanations)
- Issues caught before human review (better use of senior time)
## Best Practices

### 1. Focus on High-Value Reviews

**Use LLM reviews for:**
- ✅ First-time contributors (always)
- ✅ External contributors (high value)
- ✅ Junior developers (educational feedback)
- ✅ Security-critical changes (extra verification)
- ✅ Large refactors (catch edge cases)

**Skip LLM reviews for** (decision logic sketched after this list):
- ❌ Minor fixes from senior developers (low value)
- ❌ Auto-generated PRs (Dependabot, Renovate)
- ❌ Documentation-only changes (low value)
- ❌ Merge commits and version bumps
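These filters can live in workflow `if:` conditions, but the decision logic is easier to read as code. A sketch: the field names mirror GitHub's pull request webhook payload, and the docs-only heuristic is an assumption; extend it for cases like security-critical paths:

```typescript
interface PullRequestInfo {
  authorLogin: string;
  authorAssociation: string; // e.g. 'FIRST_TIME_CONTRIBUTOR', 'CONTRIBUTOR', 'MEMBER'
  changedFilePaths: string[];
}

const BOT_AUTHORS = new Set(['dependabot[bot]', 'renovate[bot]']);

function shouldRunLlmReview(pr: PullRequestInfo): boolean {
  // Auto-generated PRs: skip
  if (BOT_AUTHORS.has(pr.authorLogin)) return false;

  // Documentation-only changes: skip (naive heuristic)
  const docsOnly =
    pr.changedFilePaths.length > 0 &&
    pr.changedFilePaths.every((p) => p.endsWith('.md') || p.startsWith('docs/'));
  if (docsOnly) return false;

  // First-time and external contributors: always review
  return (
    pr.authorAssociation === 'FIRST_TIME_CONTRIBUTOR' ||
    pr.authorAssociation === 'CONTRIBUTOR'
  );
}
```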
### 2. Provide Rich Context

Ensure Claude has access to project context:

```yaml
direct_prompt: |
  Available context:
  1. CLAUDE.md files (architecture, patterns, standards)
  2. schemas/ directory (data models)
  3. tests/ directory (expected behavior)
  4. .eslintrc.js (code style rules)
  5. tsconfig.json (TypeScript config)

  Use this context to ensure the PR aligns with project standards.
```
### 3. Be Specific in Prompts

❌ Vague prompt:

```yaml
direct_prompt: "Review this code"
```

✅ Specific prompt:

```yaml
direct_prompt: |
  Review focusing on:
  1. Type safety (no 'any', proper return types)
  2. Security (input validation, SQL injection)
  3. Error handling (try/catch, proper error messages)
  4. Tests (coverage for new/modified code)
  5. Architecture (follows repository pattern from CLAUDE.md)

  Provide code examples for each suggestion.
```
### 4. Calibrate Review Tone

Adjust tone based on audience.

**For beginners (encouraging):**

```yaml
direct_prompt: |
  This is a first-time contributor.
  - Be encouraging and welcoming
  - Explain *why* changes are needed
  - Provide code examples
  - Link to documentation
  - Praise good patterns
```

**For experienced devs (concise):**

```yaml
direct_prompt: |
  Experienced contributor.
  - Focus on critical issues only
  - Be concise
  - Assume familiarity with patterns
```
### 5. Combine with Human Review

LLM reviews augment, not replace, human reviews:

```text
LLM Review (automated, immediate):
├─ Syntax, types, basic errors
├─ Security vulnerabilities
├─ Code style, linting
└─ Test coverage

Human Review (manual, later):
├─ Architecture decisions
├─ Business logic correctness
├─ API design
└─ Product requirements
```

**Workflow:**
1. PR opened → LLM reviews immediately (2 min)
2. Contributor fixes LLM-identified issues (30 min)
3. Human reviewer sees a cleaner PR → faster review (10 min vs 30 min)
4. Result: faster time-to-merge, higher quality
### 6. Monitor Review Quality

Track metrics:

```typescript
interface ReviewMetrics {
  totalPRs: number;
  llmReviewsRun: number;
  issuesFound: number;
  falsePositives: number;
  timeToFirstReview: number; // seconds
  contributorSatisfaction: number; // 1-5 rating
}

const metrics: ReviewMetrics = {
  totalPRs: 247,
  llmReviewsRun: 198,
  issuesFound: 542,
  falsePositives: 23, // 4.2% false positive rate
  timeToFirstReview: 127, // ~2 minutes
  contributorSatisfaction: 4.3,
};
```
**Target metrics:**
- LLM review completion: <5 minutes
- False positive rate: <10%
- Issues found per PR: 2-5
- Contributor satisfaction: >4.0/5
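A small helper that checks a `ReviewMetrics` sample (the interface above) against these targets; the thresholds are the ones listed, so treat them as starting points:

```typescript
function meetsTargets(m: ReviewMetrics): { ok: boolean; problems: string[] } {
  const problems: string[] = [];

  // LLM review completion: <5 minutes (timeToFirstReview is in seconds)
  if (m.timeToFirstReview >= 5 * 60) problems.push('review slower than 5 minutes');

  // False positive rate: <10%
  const fpRate = m.falsePositives / Math.max(m.issuesFound, 1);
  if (fpRate >= 0.1) problems.push(`false positive rate ${(fpRate * 100).toFixed(1)}%`);

  // Issues found per PR: 2-5
  const issuesPerPr = m.issuesFound / Math.max(m.llmReviewsRun, 1);
  if (issuesPerPr < 2 || issuesPerPr > 5) problems.push('issues per PR outside 2-5 band');

  // Contributor satisfaction: >4.0/5
  if (m.contributorSatisfaction <= 4.0) problems.push('satisfaction at or below 4.0/5');

  return { ok: problems.length === 0, problems };
}

// The sample metrics above pass all four targets:
console.log(meetsTargets(metrics)); // { ok: true, problems: [] }
```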
## Common Pitfalls

### ❌ Pitfall 1: Over-Reliance on the LLM

**Problem:** Skipping human review entirely.
**Solution:** Let the LLM catch routine issues while humans review architecture and business logic.

### ❌ Pitfall 2: Generic Prompts

**Problem:** "Review this code" produces generic feedback.
**Solution:** Use specific, context-aware prompts with examples.

### ❌ Pitfall 3: Reviewing Everything

**Problem:** Running LLM review on trivial PRs (typo fixes, version bumps).
**Solution:** Filter by PR type, author, and changed files.

### ❌ Pitfall 4: Ignoring False Positives

**Problem:** The LLM suggests invalid changes, confusing contributors.
**Solution:** Monitor feedback, refine prompts, and add context to CLAUDE.md.

### ❌ Pitfall 5: No Tool Access

**Problem:** The LLM can't run tests or linting, so it misses issues.
**Solution:** Grant controlled tool access (tests and linting only).
## Integration with Other Patterns

### LLM Review + Hierarchical CLAUDE.md

Claude reads CLAUDE.md files to understand project patterns:

```text
Root CLAUDE.md: Global architecture, patterns
├─ api/CLAUDE.md: API conventions, error handling
├─ database/CLAUDE.md: Query patterns, migrations
└─ ui/CLAUDE.md: Component patterns, accessibility
```

The LLM reviews each PR against the relevant CLAUDE.md files.
### LLM Review + Custom ESLint Rules

Claude verifies compliance with custom rules:

```js
// .eslintrc.js
module.exports = {
  rules: {
    'no-direct-supabase-access': 'error', // Custom rule
  },
};

// LLM checks for violations:
// "This code accesses Supabase directly, violating the
// 'no-direct-supabase-access' rule. Use the repository pattern instead."
```
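For reference, a minimal sketch of what such a custom rule could look like. The detection heuristic (flagging member access on an identifier named `supabase`) is deliberately naive; a production rule would also track imports:

```js
// eslint-rules/no-direct-supabase-access.js (sketch)
module.exports = {
  meta: {
    type: 'problem',
    messages: {
      noDirect: 'Do not access Supabase directly; use the repository layer.',
    },
    schema: [],
  },
  create(context) {
    return {
      // Flags expressions like supabase.from('users')...
      MemberExpression(node) {
        if (node.object.type === 'Identifier' && node.object.name === 'supabase') {
          context.report({ node, messageId: 'noDirect' });
        }
      },
    };
  },
};
```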
### LLM Review + Test-Driven Prompting

Claude verifies tests exist for new code:

```yaml
direct_prompt: |
  Check test coverage:
  1. Find all new/modified functions
  2. Verify corresponding test files exist
  3. Check tests cover happy path + edge cases
  4. Report missing tests with suggested test cases
```
## Real-World Example

### Before LLM Review

A first-time contributor submits this PR:

```typescript
// src/api/users.ts
export async function createUser(req, res) {
  const user = await db.users.insert({
    email: req.body.email,
    password: req.body.password,
  });
  res.json(user);
}
```

**Human reviewer (6 hours later):**

"Several issues:
- No types
- Password stored in plaintext (security risk!)
- No input validation
- No error handling
- No tests

Please fix and resubmit."

**Contributor (8 hours later):** fixes and resubmits, but tests are still missing.

**Total time: 3 days to merge.**
### After LLM Review

The same code is submitted by a first-time contributor.

**Claude review (2 minutes later):**

"Welcome! Thanks for contributing. A few suggestions:

**1. Type Safety 🔒**

Add types:

```typescript
import { Request, Response } from 'express';

export async function createUser(
  req: Request,
  res: Response
): Promise<void>
```

**2. Security Issue ⚠️ Critical**

The password is stored in plaintext. Use bcrypt:

```typescript
import bcrypt from 'bcrypt';

const hashedPassword = await bcrypt.hash(req.body.password, 10);
```

**3. Input Validation ✅**

Validate email format and password strength:

```typescript
import { z } from 'zod';

const schema = z.object({
  email: z.string().email(),
  password: z.string().min(8),
});

const { email, password } = schema.parse(req.body);
```

**4. Error Handling 🐛**

Wrap in try/catch:

```typescript
try {
  // ... user creation
} catch (error) {
  res.status(500).json({ error: 'Failed to create user' });
}
```

**5. Tests 🧪**

Add tests in `src/api/users.test.ts`:
- Test successful user creation
- Test duplicate email rejection
- Test invalid email format
- Test weak password rejection

See our Contributing Guide for more details!"
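A sketch of the suggested test file, assuming Jest with supertest and an `app` export from the server entry point (neither appears in the original PR):

```typescript
// src/api/users.test.ts (sketch, assuming Jest + supertest)
import request from 'supertest';
import { app } from '../app'; // hypothetical Express app export

describe('POST /users', () => {
  it('creates a user with valid input', async () => {
    const res = await request(app)
      .post('/users')
      .send({ email: 'new@example.com', password: 'correct-horse-battery' });
    expect(res.status).toBe(201); // assuming the endpoint returns 201 on success
  });

  it('rejects an invalid email', async () => {
    const res = await request(app)
      .post('/users')
      .send({ email: 'not-an-email', password: 'correct-horse-battery' });
    expect(res.status).toBe(400);
  });

  it('rejects a weak password', async () => {
    const res = await request(app)
      .post('/users')
      .send({ email: 'new@example.com', password: 'short' });
    expect(res.status).toBe(400);
  });
});
```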
**Contributor (1 hour later):** fixes all issues, adds tests, resubmits.

**Human reviewer (2 hours later):** quick approval (the LLM caught all routine issues).

**Total time: same-day merge ✅**
## Measuring Success

### Key Metrics

**1. Time to first review**
- Before: 4-8 hours (waiting for a human reviewer)
- After: 2-5 minutes (automated LLM review)
- Improvement: 95%+ faster

**2. Time to merge**
- Before: 2-3 days (multiple review cycles)
- After: 4-8 hours (issues caught early)
- Improvement: ~70% faster

**3. Review coverage**
- Before: 60% of PRs reviewed within 24 hours
- After: 100% of PRs reviewed within 5 minutes
- Improvement: universal coverage

**4. Issue detection**
- LLM catches 3-5 routine issues per PR
- Human reviewer time saved: 10-15 minutes per PR

**5. Contributor satisfaction** (survey results, 1-5 scale)
- Speed of feedback: 4.7/5
- Quality of feedback: 4.3/5
- Helpfulness: 4.5/5
## Conclusion

LLM code review in CI is a high-leverage automation that:

- ✅ Provides instant feedback (minutes vs hours)
- ✅ Catches routine issues automatically
- ✅ Scales with PR volume (no bottleneck on senior developers)
- ✅ Educates contributors with detailed explanations
- ✅ Costs pennies ($0.10-0.50 per PR)
- ✅ Improves consistency (no tired or rushed reviews)

**Best for:**
- First-time contributors (always)
- External contributors (high value)
- Security-critical changes (extra verification)
- Teams with review bottlenecks

**Cost:** $10-200/month depending on team size
**ROI:** 100-1000x (time saved vs cost)
**Integration:** ~10 minutes to set up the GitHub Action

**Result:** faster PRs, happier contributors, better code quality.
## Related Concepts

- **Verification Sandwich Pattern**: LLM review as a quality-gate layer
- **Claude Code Hooks Quality Gates**: similar automation for local development
- **Hierarchical Context Patterns**: provides context for reviews
- **Custom ESLint Rules for Determinism**: deterministic checks the LLM can verify
## References

- [Claude Code GitHub Action](https://github.com/anthropics/claude-code-action) – official Claude Code GitHub Action for CI/CD integration
- [GitHub Actions Documentation](https://docs.github.com/en/actions) – guide to setting up GitHub Actions workflows

