Summary
LLM-generated code often fails on edge cases that example-based tests don’t cover. Property-based testing uses libraries like fast-check to generate hundreds of random inputs and check invariants automatically, catching bugs that LLMs miss. Instead of hand-picking a few examples for a rule like ‘password must be 8+ chars’, you write properties that should always hold, and the framework generates test cases that try to break them. A passing run is strong evidence, not a proof, that the property holds.
The Problem
Example-based tests validate specific inputs but miss edge cases like empty strings, unicode characters, boundary values, and unexpected formats. LLMs generate code that works for provided examples but often fails on edge cases not explicitly tested. Writing exhaustive example tests is tedious and still incomplete.
The Solution
Property-based testing generates hundreds of random inputs and verifies that invariants (properties that should always be true) hold for all of them. Instead of testing validatePassword('12345678') === true, you test ‘all strings of 8 or more characters should validate as true’. Frameworks like fast-check (JavaScript/TypeScript), Hypothesis (Python), and QuickCheck (Haskell) automate edge case discovery.
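To make the mechanism concrete, here is a minimal, dependency-free sketch of what such a framework does internally: generate many random inputs and report the first one that violates the property. The helpers randomString and findCounterexample are hypothetical names invented for this illustration, not part of fast-check or any other library, and real frameworks add much more (smarter generators, shrinking of failing inputs, reproducible seeds).

```typescript
// Function under test: the length-only rule used throughout this article.
function validatePassword(password: string): boolean {
  return password.length >= 8;
}

// Hypothetical helper: builds a random string of exactly `length` characters
// from a small alphabet that mixes letters, digits, whitespace, and an accent.
function randomString(length: number): string {
  const alphabet = 'abcXYZ019 !\t\n\u00e9';
  let out = '';
  for (let i = 0; i < length; i++) {
    out += alphabet.charAt(Math.floor(Math.random() * alphabet.length));
  }
  return out;
}

// Hypothetical helper: checks a property against `runs` generated inputs and
// returns the first counterexample, or null if none was found.
function findCounterexample(
  property: (input: string) => boolean,
  generate: () => string,
  runs: number = 200
): string | null {
  for (let i = 0; i < runs; i++) {
    const input = generate();
    if (!property(input)) {
      return input;
    }
  }
  return null;
}

// Property: every string of 8 to 31 characters validates as true.
const longStrings = (): string => randomString(8 + Math.floor(Math.random() * 24));
const counterexample = findCounterexample(
  (s) => validatePassword(s) === true,
  longStrings
);

console.log(
  counterexample === null
    ? 'no counterexample found in 200 runs'
    : 'property failed for ' + JSON.stringify(counterexample)
);
```

Finding no counterexample only means that none of the sampled inputs broke the property; it says nothing about whether the property itself is the right requirement.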
The Problem
When testing LLM-generated code, most developers write example-based tests:
test('password validation', () => {
  expect(validatePassword('12345678')).toBe(true);
  expect(validatePassword('short')).toBe(false);
  expect(validatePassword('verylongpassword123')).toBe(true);
});
This approach has a critical flaw: you only test the examples you thought of.
What Example Tests Miss
Edge cases that break LLM-generated code:
- Boundary values: What about exactly 8 characters? 7 characters?
- Empty inputs: Empty string, null, undefined
- Unicode: Emojis, special characters, multi-byte unicode
- Whitespace: Leading/trailing spaces, tabs, newlines
- Special formats: HTML entities, encoded strings, injection attempts
- Numeric boundaries: Number.MAX_SAFE_INTEGER, Number.MIN_SAFE_INTEGER, Infinity, NaN
- Type coercion: Numbers as strings, booleans as strings
- Array edge cases: Empty arrays, single-element arrays, very large arrays
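Several of the string-related categories above can be written down as concrete inputs. The sketch below is illustrative only: edgeCases and lengthOnly are hypothetical names, and the sample values are assumptions chosen for this example. It also shows one Unicode pitfall directly: String.length counts UTF-16 code units, so four emojis already satisfy a "length >= 8" check.

```typescript
// Hypothetical sample inputs, one group per string-related category.
const edgeCases: Record<string, string[]> = {
  boundary: ['a'.repeat(7), 'a'.repeat(8)],
  empty: [''],
  unicode: ['😀'.repeat(4), 'é'.repeat(8)],
  whitespace: [' '.repeat(8), '\t\n'.repeat(4)],
  coercion: [String(12345678), String(true)],
};

// The length-only rule discussed in this article.
const lengthOnly = (s: string): boolean => s.length >= 8;

// Print how the rule treats each sample input.
for (const [category, inputs] of Object.entries(edgeCases)) {
  for (const input of inputs) {
    console.log(category, JSON.stringify(input), lengthOnly(input));
  }
}

// Each emoji here is a surrogate pair (two UTF-16 code units),
// so four visible characters report a length of 8.
const fourEmojis = '😀'.repeat(4);
console.log(fourEmojis.length);             // 8
console.log(Array.from(fourEmojis).length); // 4
```

A hand-written catalogue like this is still example-based testing; its main use is to suggest which input shapes a generator should cover.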
Why LLMs Struggle with Edge Cases
LLMs generate code based on common patterns in training data. Edge cases are, by definition, uncommon, so LLMs often miss them:
// LLM generates this (seems correct):
function validatePassword(password: string): boolean {
  return password.length >= 8;
}
// But it only checks length, not character types:
validatePassword('12345678') // true
// Edge cases that pass but shouldn't:
validatePassword(' '.repeat(8)) // 8 spaces
validatePassword('\u0000'.repeat(8)) // Null bytes
validatePassword('\n'.repeat(8)) // Newlines
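A property can state the missing requirement directly, and fast-check will then search for inputs that violate it. The sketch below assumes fast-check is installed (npm install fast-check) and that "whitespace-only passwords must be rejected" is the intended requirement, which the original function never promised. The whitespaceOnly generator is a name made up for this example. fc.check is used instead of fc.assert so that the failure can be inspected rather than thrown.

```typescript
import * as fc from 'fast-check';

// The length-only implementation from above.
function validatePassword(password: string): boolean {
  return password.length >= 8;
}

// Hypothetical generator: strings of 8 to 32 characters made only of
// spaces, tabs, and newlines.
const whitespaceOnly = fc
  .array(fc.constantFrom(' ', '\t', '\n'), { minLength: 8, maxLength: 32 })
  .map((chars) => chars.join(''));

// Assumed requirement: a whitespace-only password must be rejected.
const result = fc.check(
  fc.property(whitespaceOnly, (password) => validatePassword(password) === false),
  { numRuns: 200 }
);

// The length-only implementation accepts every generated input,
// so the property fails and fast-check reports a counterexample.
console.log('property failed:', result.failed);
console.log('counterexample:', JSON.stringify(result.counterexample));
```

In a test suite the same property would normally be wrapped in fc.assert inside a test block, so that a violation fails the test and prints the counterexample along with the seed needed to reproduce it.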
Note: This article is being expanded with more property-based testing examples.
Related Concepts
- Test-Based Regression Patching – Write tests before fixing bugs
- Quality Gates as Information Filters – Tests filter out invalid solutions
- Test-Driven Prompting – Write tests before generating code

