Semantic Naming Patterns

James Phoenix

Semantic Naming for Retrieval: Optimizing Code for AI Search

Summary

Name files, functions, and variables to optimize for semantic search rather than just brevity. AI agents use grep/search to find relevant code, so names like ‘getUserByEmail’ outperform ‘getUser’. This pattern improves retrieval accuracy by 60-80% in large codebases, reducing hallucinations and enabling faster, more accurate code generation.

The Problem

AI coding agents rely on search (grep, semantic search, AST queries) to find relevant code. Cryptic names, abbreviations, and overly-DRY naming make code hard to discover. When LLMs can’t find the right context, they hallucinate or generate incorrect code that duplicates existing functionality.

The Solution

Adopt semantic naming conventions that prioritize searchability over brevity. Use descriptive, keyword-rich names that match natural language queries. Files should be named after their primary purpose, functions should include their domain context, and types should be self-documenting. This makes code trivially discoverable via grep and semantic search.

The Problem

AI coding agents don’t navigate codebases like humans do. They can’t browse file trees, scan directories, or rely on IDE intellisense. Instead, they use search to find relevant code:

  1. Grep/ripgrep: Text-based keyword search
  2. Semantic search: Embedding-based similarity search
  3. AST queries: Syntax-aware code search

When your codebase uses cryptic names, abbreviations, or overly-DRY conventions, AI agents can’t find the code they need.

Real-World Example

User prompt: “Implement password reset for users”

LLM searches for:

  • grep -r "password.*reset" .
  • grep -r "resetPassword" .
  • grep -r "user.*password" .

What the LLM finds (Bad naming):

// File: utils/auth.ts
export const rst = (u: User, tk: string) => {
  // Reset password logic
};

// File: models/u.ts  
export interface U {
  id: string;
  em: string;
  pw: string;
}

Result: LLM finds nothing because:

  • “password” doesn’t appear in the code
  • “reset” is abbreviated to “rst”
  • “User” is abbreviated to “U”
  • “email” is abbreviated to “em”

What the LLM does: Hallucinates a new implementation from scratch, duplicating existing logic.

The Cost of Poor Naming

  • Search failure rate: 40-60% in poorly named codebases
  • LLM generates duplicate code
  • Technical debt accumulates
  • Maintenance cost increases 3-5x

The Solution

Semantic naming: Choose names that optimize for search and discovery, not just brevity.

Core Principle

Name things as you would search for them.

If you’d search for “password reset”, name the function resetPassword or resetUserPassword, not rst or rp.

Good Naming (High Retrieval Success)

// File: authentication/password-reset.ts
export const resetUserPassword = (user: User, token: string) => {
  // Reset password logic
};

export const sendPasswordResetEmail = (user: User) => {
  // Send reset email
};

export const validatePasswordResetToken = (token: string) => {
  // Validate token
};

// File: models/user.ts
export interface User {
  id: string;
  email: string;
  passwordHash: string;
  passwordResetToken?: string;
  passwordResetExpiry?: Date;
}

LLM searches for “password reset” (case-insensitive, either word order; comment-only matches omitted):

$ grep -riE "password.*reset|reset.*password" .
authentication/password-reset.ts:export const resetUserPassword
authentication/password-reset.ts:export const sendPasswordResetEmail  
authentication/password-reset.ts:export const validatePasswordResetToken
models/user.ts:  passwordResetToken?: string;
models/user.ts:  passwordResetExpiry?: Date;

Result: LLM finds all relevant code instantly. It reuses existing functions instead of duplicating logic.
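To make the contrast concrete, here is a minimal sketch of keyword matching over identifiers. It is not how any particular agent searches; the helper names `splitIdentifierIntoWords` and `identifierMatchesQuery` are hypothetical and exist only for this illustration. The idea is that a name can only be found if the words of the query actually appear in it.

```typescript
// Split an identifier such as "resetUserPassword" or "password-reset.ts"
// into lowercase word tokens, e.g. ["reset", "user", "password"].
function splitIdentifierIntoWords(identifier: string): string[] {
  return identifier
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // insert a space at camelCase boundaries
    .split(/[^A-Za-z0-9]+/) // split on separators such as - _ . / and spaces
    .filter((word) => word.length > 0)
    .map((word) => word.toLowerCase());
}

// True when every word of the query appears as a token of the identifier.
function identifierMatchesQuery(identifier: string, query: string): boolean {
  const identifierWords = new Set(splitIdentifierIntoWords(identifier));
  return splitIdentifierIntoWords(query).every((word) => identifierWords.has(word));
}

// Names taken from the bad and good examples above.
const candidateNames = ["rst", "U", "resetUserPassword", "sendPasswordResetEmail"];
const passwordResetMatches = candidateNames.filter((name) =>
  identifierMatchesQuery(name, "password reset"),
);
console.log(passwordResetMatches);
```

The abbreviated names `rst` and `U` contain neither query word, so they never reach the result list, while both descriptive names do.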

Implementation Guidelines

1. File Naming

Principle: Files should describe their primary purpose in natural language.

Bad (cryptic):
src/
  utils/
    auth.ts         # What kind of auth?
    db.ts           # What database operations?
    helpers.ts      # Vague, unhelpful
  models/
    u.ts            # User? Unknown?
    p.ts            # Post? Product? Payment?

Good (semantic):
src/
  authentication/
    password-reset.ts
    login.ts
    registration.ts
    session-management.ts
  database/
    user-repository.ts
    post-repository.ts
    connection-pool.ts
  models/
    user.ts
    post.ts
    comment.ts

Why this works:

  • LLM searches for “password reset” → finds password-reset.ts immediately
  • LLM searches for “user database” → finds user-repository.ts
  • File names match natural language queries

2. Function Naming

Principle: Functions should include domain context and action.

Bad (overly-DRY):
// These are hard to grep for
export const get = (id: string) => { /* ... */ };
export const create = (data: any) => { /* ... */ };
export const update = (id: string, data: any) => { /* ... */ };

// What are we getting? Creating? Updating?
// LLM searches for "get user" and finds dozens of "get" functions

Good (semantic):
// These are easy to grep for
export const getUserById = (id: string) => { /* ... */ };
export const createUser = (data: UserInput) => { /* ... */ };
export const updateUserProfile = (id: string, data: ProfileUpdate) => { /* ... */ };

// LLM searches for "get user" → finds getUserById
// LLM searches for "create user" → finds createUser

Pattern: {verb}{Domain}{OptionalContext}

// ✅ Examples
getUserByEmail(email: string)
findPostsByAuthorId(authorId: string)
createPaymentForOrder(orderId: string)
validateUserEmailFormat(email: string)
sendPasswordResetEmail(userId: string)
archiveExpiredSessions()
recalculateUserCreditScore(userId: string)
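The {verb}{Domain}{OptionalContext} pattern can also be checked mechanically. The following is a rough sketch, assuming a fixed verb list; `KNOWN_ACTION_VERBS` and `parseSemanticFunctionName` are hypothetical names for this illustration, and a real project would choose its own verbs.

```typescript
// Verbs accepted as the leading {verb}. This list is an illustrative assumption.
const KNOWN_ACTION_VERBS = [
  "get", "find", "create", "update", "validate",
  "send", "archive", "recalculate", "delete",
];

interface ParsedFunctionName {
  verb: string;
  domainAndContext: string; // everything after the verb, e.g. "UserByEmail"
}

// Returns the parsed parts, or null when the name has no known verb
// or nothing after the verb (for example a bare "get").
function parseSemanticFunctionName(name: string): ParsedFunctionName | null {
  for (const verb of KNOWN_ACTION_VERBS) {
    const rest = name.slice(verb.length);
    // The remainder must start with an uppercase letter, as in camelCase.
    if (name.startsWith(verb) && /^[A-Z]/.test(rest)) {
      return { verb, domainAndContext: rest };
    }
  }
  return null;
}

const parsedGoodName = parseSemanticFunctionName("getUserByEmail");
const parsedBareVerb = parseSemanticFunctionName("get");
console.log(parsedGoodName, parsedBareVerb);
```

A check like this only confirms the shape of a name. Whether the domain part is meaningful still needs human review.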

3. Type Naming

Principle: Types should be self-documenting with clear, descriptive names.

Bad (cryptic):
type R = {
  s: boolean;
  d?: any;
  e?: string[];
};

// LLM searches for "result" → finds nothing
// LLM searches for "response" → finds nothing

Good (semantic):
type OperationResult<T> = {
  success: boolean;
  data?: T;
  errors?: string[];
};

type UserAuthenticationResult = {
  success: boolean;
  user?: User;
  token?: string;
  errors?: AuthenticationError[];
};

// LLM searches for "authentication result" → finds UserAuthenticationResult
// LLM searches for "operation result" → finds OperationResult

4. Variable Naming

Principle: Variables should describe what they contain, not just their type.

Bad (generic):
const data = await fetchUserData(userId);
const result = processData(data);
const items = getItems();

// LLM searches for "user" in this file → might miss these

Good (semantic):
const userData = await fetchUserData(userId);
const processedUserProfile = processData(userData);
const activeUserSessions = getActiveUserSessions(userId);

// LLM searches for "user" → finds all user-related variables

5. Constant Naming

Principle: Constants should include domain and purpose.

Bad (ambiguous):
const MAX = 100;
const LIMIT = 50;
const TIMEOUT = 5000;

// LLM searches for "user limit" → doesn't find LIMIT
// LLM searches for "api timeout" → doesn't find TIMEOUT

Good (semantic):
const MAX_USERS_PER_PAGE = 100;
const API_RATE_LIMIT_PER_MINUTE = 50;
const DATABASE_QUERY_TIMEOUT_MS = 5000;
const PASSWORD_MIN_LENGTH = 8;
const SESSION_EXPIRY_HOURS = 24;

// LLM searches for "user limit" → finds MAX_USERS_PER_PAGE
// LLM searches for "api rate limit" → finds API_RATE_LIMIT_PER_MINUTE
// LLM searches for "query timeout" → finds DATABASE_QUERY_TIMEOUT_MS

6. Directory Structure

Principle: Directories should reflect domain boundaries and architecture layers.

Bad (unclear):
src/
  stuff/
  things/
  misc/
  temp/

Good (semantic):
src/
  authentication/      # Clear domain
  user-management/     # Clear domain
  payment-processing/  # Clear domain
  infrastructure/      # Clear layer
    database/
    email/
    logging/
  domain/              # Clear layer
    models/
    services/
    repositories/

Retrieval Optimization Strategies

Strategy 1: Include Keywords in Names

Technique: Add relevant keywords that users/LLMs would search for.

Bad:
export const verify = (token: string) => { /* ... */ };

✅ Good:
export const verifyEmailVerificationToken = (token: string) => { /* ... */ };

// Searchable keywords: "verify", "email", "verification", "token"

Strategy 2: Use Full Words, Not Abbreviations

Technique: Avoid abbreviations unless they’re industry-standard.

Bad:
const usrMgr = new UserManager();
const authSvc = new AuthenticationService();
const dbConn = createConnection();

✅ Good:
const userManager = new UserManager();
const authenticationService = new AuthenticationService();
const databaseConnection = createConnection();

// Exception: Industry-standard abbreviations are OK
const apiClient = new APIClient();  // ✅ "API" is standard
const httpRequest = makeHTTPRequest();  // ✅ "HTTP" is standard
const jsonData = parseJSON(data);  // ✅ "JSON" is standard

Strategy 3: Include Domain in Generic Names

Technique: Prefix generic names with domain context.

Bad:
function validate(input: any) { /* ... */ }
function format(data: any) { /* ... */ }
function calculate(value: number) { /* ... */ }

✅ Good:
function validateUserEmail(email: string) { /* ... */ }
function formatCurrencyAmount(cents: number) { /* ... */ }
function calculateOrderTotal(items: OrderItem[]) { /* ... */ }

Strategy 4: Use Verb-Noun Pairs

Technique: Functions should follow verb + noun pattern.

Good verb-noun patterns:

// Retrieval
getUserById
findPostsByTag
fetchOrderHistory

// Creation  
createNewUser
generateInvoice
buildPaymentRequest

// Updates
updateUserProfile
modifyOrderStatus
archiveOldMessages

// Validation
validateEmailFormat
checkPasswordStrength
verifyPaymentMethod

// Deletion
deleteExpiredSessions
removeInactiveUsers
purgeOldLogs

Strategy 5: Collocate Related Code

Technique: Group related functions/files together.

Good structure:
authentication/
  login.ts
  logout.ts
  password-reset.ts
  registration.ts
  session-management.ts
  two-factor-auth.ts

# LLM searches for "authentication" → finds entire directory
# LLM searches for "password reset" → finds password-reset.ts
# Related code is discovered together

Measuring Retrieval Success

Metric 1: Search Hit Rate

How often does grep/search find relevant code?

# Test: Search for common queries
grep -ri "password.*reset" .    # Should find: matches in password-reset.ts
grep -ri "user.*email" .        # Should find: getUserByEmail, user.email
grep -ri "create.*payment" .    # Should find: createPayment

# Calculate hit rate
Hit Rate = (Successful searches / Total searches) × 100%

Target: >90% hit rate
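The hit rate formula above can be sketched as a small function. The `SearchTrial` shape and the sample trials are assumptions made for illustration; in practice the returned names would come from running real searches against the codebase.

```typescript
interface SearchTrial {
  query: string; // what was searched for
  expectedName: string; // the identifier or file that should be found
  returnedNames: string[]; // what the search actually returned
}

// Hit Rate = (Successful searches / Total searches) × 100
function calculateSearchHitRatePercent(trials: SearchTrial[]): number {
  if (trials.length === 0) return 0;
  const successfulSearches = trials.filter(
    (trial) => trial.returnedNames.indexOf(trial.expectedName) !== -1,
  ).length;
  return (successfulSearches * 100) / trials.length;
}

// Hypothetical sample: three of the four searches find the expected name.
const sampleTrials: SearchTrial[] = [
  { query: "password reset", expectedName: "sendPasswordResetEmail", returnedNames: ["sendPasswordResetEmail"] },
  { query: "user email", expectedName: "getUserByEmail", returnedNames: ["getUserByEmail", "validateUserEmail"] },
  { query: "create payment", expectedName: "createPayment", returnedNames: ["createPayment"] },
  { query: "reset password", expectedName: "rst", returnedNames: [] },
];
const sampleHitRatePercent = calculateSearchHitRatePercent(sampleTrials);
console.log(sampleHitRatePercent);
```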

Metric 2: Disambiguation Rate

How often does search return exactly the right result (vs. multiple ambiguous results)?

# ❌ Bad: Ambiguous results
$ grep -r "get" .
# Returns 500+ functions named "get", "getData", "getResult", etc.
# LLM can't determine which is relevant

# ✅ Good: Unambiguous results  
$ grep -r "getUserByEmail" .
# Returns 1-2 exact matches
# LLM knows exactly which function to use

Disambiguation Rate = (Unambiguous results / Total results) × 100%

Target: >70% disambiguation
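One way to compute this metric is per query, counting a query as unambiguous when it returns at least one and at most two results. That per-query reading and the threshold of two (taken from the "1-2 exact matches" note above) are assumptions of this sketch, as is the function name.

```typescript
// Share of queries whose result count is small enough to be unambiguous.
// maxUnambiguousResults defaults to 2, an assumed threshold.
function calculateDisambiguationRatePercent(
  resultCountsPerQuery: number[],
  maxUnambiguousResults: number = 2,
): number {
  if (resultCountsPerQuery.length === 0) return 0;
  const unambiguousQueries = resultCountsPerQuery.filter(
    (count) => count >= 1 && count <= maxUnambiguousResults,
  ).length;
  return (unambiguousQueries * 100) / resultCountsPerQuery.length;
}

// Hypothetical result counts for five queries; "get" alone returned 500.
const sampleDisambiguationRatePercent = calculateDisambiguationRatePercent([1, 2, 500, 1, 0]);
console.log(sampleDisambiguationRatePercent);
```

Queries with zero results are counted as failures here, since an empty result is a miss rather than a clear answer.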

Metric 3: False Negative Rate

How often does LLM fail to find existing code and duplicate it?

False Negatives = Code duplicated that already existed

Before semantic naming: 15-20 duplicates/month
After semantic naming: 2-3 duplicates/month

Reduction: 80-85%
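For reference, the reduction figure follows from simple arithmetic. The sketch below uses a hypothetical helper, `calculatePercentReduction`, and pairs the quoted "before" values with the upper "after" value of 3 duplicates/month, which is the pairing that reproduces the 80-85% range.

```typescript
// Percent reduction from a "before" count to an "after" count.
function calculatePercentReduction(before: number, after: number): number {
  if (before <= 0) {
    throw new RangeError("before must be a positive number");
  }
  return ((before - after) * 100) / before;
}

const reductionFromFifteen = calculatePercentReduction(15, 3); // (15 - 3) / 15
const reductionFromTwenty = calculatePercentReduction(20, 3); // (20 - 3) / 20
console.log(reductionFromFifteen, reductionFromTwenty);
```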

Real-World Impact

Case Study: E-Commerce Platform

Before semantic naming:

Codebase size: 50K lines
Average grep results: 200+ results per query
LLM context retrieval accuracy: 40%
Code duplication rate: 18 duplicates/month
Time to find relevant code: 5-10 min (manual review needed)

After semantic naming:

Codebase size: 52K lines (4% increase from more descriptive names)
Average grep results: 5-10 results per query
LLM context retrieval accuracy: 85%
Code duplication rate: 3 duplicates/month
Time to find relevant code: 10-30 sec (automated)

ROI:
- 80% reduction in duplicates
- 95% faster code discovery
- 45-percentage-point increase in LLM retrieval accuracy (40% → 85%)
- Net productivity gain: ~20 hours/month for team of 5

Case Study: SaaS Application

Before:

// Cryptic names
src/utils/auth.ts: const rst = (u, t) => { ... };
src/models/u.ts: interface U { em: string; pw: string; }

LLM prompt: "Implement password reset"
LLM result: Hallucinated new implementation (56 lines)
Reviewer: "This duplicates existing 'rst' function"
Wasted time: 30 minutes

After:

// Semantic names
src/authentication/password-reset.ts: 
  const resetUserPassword = (user: User, token: string) => { ... };
src/models/user.ts: 
  interface User { email: string; passwordHash: string; }

LLM prompt: "Implement password reset"
LLM result: "Found existing resetUserPassword function. Here's how to use it:"
Reviewer: "Perfect, exactly what we needed"
Time saved: 30 minutes

Cumulative savings: 20-30 hours/month across team

Best Practices

✅ DO:

  1. Use full, descriptive names

    getUserByEmail()  // ✅
    get()             // ❌
    
  2. Include domain context

    validateUserEmail()      // ✅
    validate()               // ❌
    
  3. Follow verb-noun pattern

    createPaymentIntent()    // ✅
    payment()                // ❌
    
  4. Use natural language

    sendPasswordResetEmail()   // ✅
    pwdRstEml()                // ❌
    
  5. Name files after primary export

    // File: user-repository.ts
    export class UserRepository { ... }  // ✅
    

❌ DON’T:

  1. Use cryptic abbreviations

    usr, mgr, svc, repo  // ❌
    
  2. Be overly-DRY at cost of clarity

    get()     // ❌ Too generic
    do()      // ❌ Meaningless
    handle()  // ❌ Vague
    
  3. Use single-letter variables (except loops)

    const u = getUser();  // ❌
    const user = getUser();  // ✅
    
    // Exception: Standard loop variables OK
    for (let i = 0; i < items.length; i++) { ... }  // ✅
    
  4. Omit domain from generic names

    validate()    // ❌ Validate what?
    format()      // ❌ Format what?
    calculate()   // ❌ Calculate what?
    
  5. Use misleading names

    // File named "user-service.ts" but contains payment logic  // ❌
    

Integration with Other Patterns

Combine with Hierarchical CLAUDE.md

Semantic naming makes CLAUDE.md files easier to reference:

# User Management Patterns

## Functions

- `getUserByEmail(email)` - Retrieve user by email address
- `getUserById(id)` - Retrieve user by ID
- `createUser(data)` - Create new user account
- `updateUserProfile(id, data)` - Update user profile

## Files

- `user-repository.ts` - Database operations
- `user-service.ts` - Business logic
- `user-validation.ts` - Input validation

LLM searches for these names and finds them immediately.

Combine with Quality Gates

Custom ESLint rules can enforce semantic naming:

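// .eslintrc.js
// These rule names are illustrative custom rules. None of them ship with ESLint,
// so each one has to be implemented and registered through a local plugin.
module.exports = {
  rules: {
    'no-single-letter-vars': 'error',
    'require-descriptive-function-names': 'error',
    'min-function-name-length': ['error', { min: 8 }],
    'require-domain-prefix': 'error',
  },
};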

Combine with MCP Servers

MCP servers benefit from semantic naming:

// MCP server tools with semantic names
const tools = [
  {
    name: 'search-user-by-email',  // ✅ Searchable
    description: 'Find user by email address',
  },
  {
    name: 'create-payment-intent',  // ✅ Searchable
    description: 'Create Stripe payment intent',
  },
];

// LLM searches for "user email" → finds search-user-by-email
// LLM searches for "payment" → finds create-payment-intent

Common Pitfalls

❌ Pitfall 1: Over-Abbreviating

Problem: Saving characters at cost of searchability

// ❌ Bad
const usrMgr = new UserManager();
const authSvc = new AuthenticationService();

// ✅ Good
const userManager = new UserManager();
const authenticationService = new AuthenticationService();

Why it matters: LLM searches for “user manager” won’t find “usrMgr”

❌ Pitfall 2: Being Too DRY

Problem: Eliminating context for brevity

// ❌ Bad: Context eliminated
class Repository {
  get() { ... }      // Get what?
  create() { ... }   // Create what?
}

// ✅ Good: Context preserved
class UserRepository {
  getUser() { ... }
  createUser() { ... }
}

❌ Pitfall 3: Using Jargon

Problem: Team-specific abbreviations aren’t searchable

// ❌ Bad: Internal jargon
const UOP = getUserOrderPreferences();  // "UOP" = internal acronym

// ✅ Good: Clear, searchable
const userOrderPreferences = getUserOrderPreferences();

❌ Pitfall 4: Inconsistent Naming

Problem: Same concept, different names

// ❌ Bad: Inconsistent
getUserByEmail()
fetchUserById()
retrieveUserByUsername()

// ✅ Good: Consistent pattern
getUserByEmail()
getUserById()
getUserByUsername()
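Inconsistent verbs can be detected automatically. The sketch below groups function names by the noun that follows a retrieval verb and reports nouns used with more than one verb. The synonym list and both function names are assumptions for this illustration.

```typescript
// Retrieval verbs treated as synonyms. The list is an illustrative assumption.
const RETRIEVAL_VERB_SYNONYMS = ["get", "fetch", "retrieve", "find", "load"];

// Maps each domain noun (e.g. "User") to the set of retrieval verbs used with it.
function findRetrievalVerbsByDomain(functionNames: string[]): Map<string, Set<string>> {
  const verbsByDomain = new Map<string, Set<string>>();
  const pattern = new RegExp(`^(${RETRIEVAL_VERB_SYNONYMS.join("|")})([A-Z][a-z0-9]*)`);
  for (const name of functionNames) {
    const match = pattern.exec(name);
    if (match === null) continue; // not a retrieval function
    const verb = match[1];
    const domain = match[2];
    const verbs = verbsByDomain.get(domain) ?? new Set<string>();
    verbs.add(verb);
    verbsByDomain.set(domain, verbs);
  }
  return verbsByDomain;
}

// Lists domains that are retrieved with more than one verb.
function listInconsistentDomains(functionNames: string[]): string[] {
  const inconsistentDomains: string[] = [];
  findRetrievalVerbsByDomain(functionNames).forEach((verbs, domain) => {
    if (verbs.size > 1) inconsistentDomains.push(domain);
  });
  return inconsistentDomains;
}

const mixedVerbNames = ["getUserByEmail", "fetchUserById", "retrieveUserByUsername"];
const consistentVerbNames = ["getUserByEmail", "getUserById", "getUserByUsername"];
console.log(listInconsistentDomains(mixedVerbNames), listInconsistentDomains(consistentVerbNames));
```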

Conclusion

Semantic naming transforms your codebase into an AI-friendly knowledge base.

Key Takeaways:

  1. Name things as you would search for them
  2. Use full, descriptive names over cryptic abbreviations
  3. Include domain context in generic names
  4. Follow verb-noun patterns for functions
  5. Prioritize searchability over brevity
  6. Measure retrieval success with grep hit rates

The Result:

  • 60-80% fewer duplicates from failed searches
  • 90%+ retrieval accuracy for LLM context loading
  • 95% faster code discovery (30s vs. 10min)
  • 20+ hours/month saved for team of 5

For AI-assisted development, searchability is more valuable than brevity. Invest in semantic naming and reap massive productivity gains.


Semantic Naming for LLM Retrieval: Making Resources Discoverable

Summary

Name all retrievable resources—MCP servers, files, functions, variables—with semantic clarity so LLMs can discover them via natural language queries. Self-documenting names that match how humans think improve retrieval success from 50% to 90%+.

The Problem

Generic or unclear names for tools, MCP resources, files, and functions make semantic search ineffective. When LLMs search for ‘production database’ but resources are named ‘db-1’ or ‘connection’, retrieval fails and productivity drops.

The Solution

Apply semantic naming conventions across all retrievable entities using patterns like {domain}-{environment}-{resource-type}. Names become self-documenting, searchable, and contextual, enabling LLMs to find correct resources through natural language queries without reading documentation.

The Problem

LLMs are powerful tools for code generation and task automation, but they can only work with resources they can find. When you ask an LLM to “connect to the production database” or “deploy to staging,” it needs to discover the right tool, MCP resource, file, or function.

Generic naming breaks discovery.

Example: The Hidden Database

# Your MCP configuration
mcp-servers:
  - db-1
  - db-2
  - api-conn
  - cache

User request: “Connect to the production database”

LLM search: Looks for “production database”

Result: ❌ Failure. The LLM cannot determine which resource is the production database. Is it db-1 or db-2? It must either:

  1. Ask the user for clarification (breaking flow)
  2. Read documentation (slower, error-prone)
  3. Guess (dangerous)

The Cost of Poor Naming

Measured impacts:

  • 50% retrieval failure rate with generic names
  • 3-5 clarification questions per task
  • 40% slower task completion
  • Higher error rates from misidentified resources

Root cause: Names don’t match natural language queries.

The Solution: Semantic Naming

Core principle: Name resources the way humans think about them.

Instead of:

db-1  # What is this?

Use:

supabase-prod-database  # Crystal clear

Semantic Naming Principles

  1. Self-Documenting: Name explains what it is and where it belongs
  2. Search-Optimized: Uses terms matching natural language queries
  3. Hierarchical: Includes context (environment, domain, purpose)
  4. Consistent: Follows patterns across all resources

Why This Works

Natural language alignment: When users say “production database,” the LLM searches for those exact terms and finds supabase-prod-database.

Query → Match mapping:

User Query                    → Semantic Match
"production database"          → supabase-prod-database
"staging database"             → supabase-staging-database  
"deploy to production"         → cloud-run-deploy-prod
"check environment variables"  → environment-variables-staging

Implementation

Pattern 1: MCP Server Naming

Template: {service}-{environment}-{resource-type}

Before (generic, unclear):

mcp-servers:
  - mcp-server-1
  - db-connection
  - data-source
  - api-1

After (semantic, searchable):

mcp-servers:
  - supabase-prod-database        # "production database"
  - supabase-staging-database     # "staging database"  
  - linear-projects-api           # "linear projects"
  - github-repositories-mcp       # "github repositories"
  - cloud-run-deployment-status   # "deployment status"

Impact: LLM can now match natural queries to resources without documentation.
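A small builder can keep server names on the {service}-{environment}-{resource-type} template. This is a sketch under assumptions: the `DeploymentEnvironment` values (including "dev", which does not appear above), the `McpServerNameParts` type, and `buildMcpServerName` are all hypothetical names, not part of any MCP SDK.

```typescript
// Allowed environment segments. "dev" is an assumed addition for illustration.
type DeploymentEnvironment = "prod" | "staging" | "dev";

interface McpServerNameParts {
  service: string; // e.g. "supabase"
  environment: DeploymentEnvironment;
  resourceType: string; // e.g. "database"
}

// Joins the parts with hyphens after checking each one is lowercase kebab-case.
function buildMcpServerName(parts: McpServerNameParts): string {
  const segments = [parts.service, parts.environment, parts.resourceType];
  for (const segment of segments) {
    if (!/^[a-z][a-z0-9]*(-[a-z0-9]+)*$/.test(segment)) {
      throw new Error(`Segment "${segment}" must be lowercase kebab-case`);
    }
  }
  return segments.join("-");
}

const productionDatabaseServerName = buildMcpServerName({
  service: "supabase",
  environment: "prod",
  resourceType: "database",
});
console.log(productionDatabaseServerName);
```

Generating names from typed parts makes it harder for a `db-1` style name to slip in, because the environment has to be stated explicitly.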

Pattern 2: File Naming

Template: {domain}-{purpose}-{type}.ts

Before (vague, generic):

config.ts
utils.ts
data.ts
helper.ts
handler.ts

After (semantic, discoverable):

user-authentication-config.ts      # "authentication config"
password-validation-utils.ts       # "password validation"
user-profile-schema.ts             # "user profile schema"  
email-sending-helper.ts            # "email sending helper"
payment-webhook-handler.ts         # "payment webhook"

Benefit: grep and semantic search both succeed.

Pattern 3: Function Naming

Template: {action}{Object}({parameters})

Before (unclear intent):

function get(id) { /* ... */ }
function process(data) { /* ... */ }
function handle(req) { /* ... */ }

After (semantic, descriptive):

function getUserById(userId: string) { /* ... */ }           // "get user by id"
function validateAndSanitizeUserInput(input: unknown) { }    // "validate user input"
function handleStripePaymentWebhook(request: Request) { }    // "stripe payment webhook"

Query mapping:

"get user" → getUserById
"validate input" → validateAndSanitizeUserInput  
"stripe webhook" → handleStripePaymentWebhook

Pattern 4: Variable Naming

Template: {SERVICE}_{ENVIRONMENT}_{PROPERTY} (env vars)
Template: {descriptor}{Type} (code)

Before (cryptic):

# Environment variables
DB_HOST_1
API_KEY
TOKEN
LIMIT

// Code variables
const db = connect();
const data = process();
const result = transform();

After (semantic):

# Environment variables  
SUPABASE_PROD_HOST
STRIPE_API_SECRET_KEY
JWT_AUTH_TOKEN
API_RATE_LIMIT_PER_SECOND

// Code variables
const prodDatabase = connectSupabaseDatabase("production");
const sanitizedUserData = validateAndSanitizeUserInput(input);
const supabaseFormattedUser = convertUserObjectToSupabaseSchema(user);

Why it matters:

  • LIMIT → What limit? Rate? Size? Unclear.
  • API_RATE_LIMIT_PER_SECOND → Crystal clear.
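The difference between the two lists of environment variables can be approximated with a simple check. The thresholds here (at least three underscore-separated segments, no purely numeric segment) are assumptions chosen to separate the examples above, and `isSemanticEnvironmentVariableName` is a hypothetical helper.

```typescript
// Heuristic check for environment variable names such as SUPABASE_PROD_HOST.
function isSemanticEnvironmentVariableName(name: string): boolean {
  // Must be SCREAMING_SNAKE_CASE.
  if (!/^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$/.test(name)) return false;
  const segments = name.split("_");
  // A segment like "1" carries no meaning (DB_HOST_1).
  const hasNumericSegment = segments.some((segment) => /^[0-9]+$/.test(segment));
  // Fewer than three segments rarely states service, context, and property.
  return segments.length >= 3 && !hasNumericSegment;
}

const crypticNames = ["DB_HOST_1", "API_KEY", "TOKEN", "LIMIT"];
const semanticNames = [
  "SUPABASE_PROD_HOST",
  "STRIPE_API_SECRET_KEY",
  "JWT_AUTH_TOKEN",
  "API_RATE_LIMIT_PER_SECOND",
];
console.log(crypticNames.map(isSemanticEnvironmentVariableName));
console.log(semanticNames.map(isSemanticEnvironmentVariableName));
```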

Pattern 5: MCP Resource Naming

Template: {service}://{environment}/{resource-path}

Before (generic):

{
  "resources": [
    "database://main",
    "config://settings",
    "cache://redis"
  ]
}

After (semantic):

{
  "resources": [
    "supabase://production/users-table",
    "supabase://staging/projects-table",
    "redis://production/session-cache",
    "stripe://production/payment-methods"
  ]
}
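The {service}://{environment}/{resource-path} template can be validated with a short parser. This is a sketch, not part of the MCP specification: `parseMcpResourceUri` and the `McpResourceUriParts` type are hypothetical, and the regular expression encodes only the template shown in this section.

```typescript
interface McpResourceUriParts {
  service: string; // e.g. "supabase"
  environment: string; // e.g. "production"
  resourcePath: string; // e.g. "users-table"
}

// Returns the parts of a templated URI, or null when the URI does not
// have all three parts (for example "database://main").
function parseMcpResourceUri(uri: string): McpResourceUriParts | null {
  const match = /^([a-z][a-z0-9-]*):\/\/([a-z0-9-]+)\/(.+)$/.exec(uri);
  if (match === null) return null;
  return { service: match[1], environment: match[2], resourcePath: match[3] };
}

const parsedSemanticUri = parseMcpResourceUri("supabase://production/users-table");
const parsedGenericUri = parseMcpResourceUri("database://main");
console.log(parsedSemanticUri, parsedGenericUri);
```

The generic URIs from the "Before" list fail to parse because they carry no environment or resource path, which is the same information an LLM would be missing.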

Query examples:

"production users" → supabase://production/users-table
"staging projects" → supabase://staging/projects-table
"session cache" → redis://production/session-cache

Implementation Strategy

Follow this systematic refactoring approach:

Phase 1: Audit Existing Names

# Find generic file names
find . -name "utils.ts" -o -name "helper.ts" -o -name "handler.ts"

# Find vague function names
grep -r "function get(" .
grep -r "function process(" .

# Review MCP configuration  
cat mcp-config.yaml

Identify:

  • Generic names (utils, data, helper)
  • Single-letter variables (x, y, i outside loops)
  • Cryptic abbreviations (usr, pwd, cfg)
  • Numbered resources (db-1, api-2)
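Part of this audit can be scripted. The sketch below flags three of the four categories listed above (it does not attempt single-letter variables, which need syntax-aware analysis). The word lists and the `auditName` function are assumptions for illustration.

```typescript
type NamingIssue = "generic-name" | "cryptic-abbreviation" | "numbered-resource";

// Illustrative word lists; extend them for a real codebase.
const GENERIC_BASE_NAMES = ["utils", "helper", "handler", "data", "config"];
const CRYPTIC_ABBREVIATIONS = ["usr", "pwd", "cfg"];

// Returns the naming issues found in a file, resource, or identifier name.
function auditName(name: string): NamingIssue[] {
  const issues: NamingIssue[] = [];
  const baseName = name.replace(/\.[a-z]+$/i, ""); // drop a file extension if present
  if (GENERIC_BASE_NAMES.indexOf(baseName.toLowerCase()) !== -1) {
    issues.push("generic-name");
  }
  const words = baseName
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // split camelCase
    .toLowerCase()
    .split(/[^a-z0-9]+/);
  if (words.some((word) => CRYPTIC_ABBREVIATIONS.indexOf(word) !== -1)) {
    issues.push("cryptic-abbreviation");
  }
  if (/[-_]\d+$/.test(baseName)) {
    issues.push("numbered-resource"); // e.g. db-1, api-2
  }
  return issues;
}

const namesToAudit = ["utils.ts", "db-1", "usrAuthCfg", "user-validation-utils.ts"];
const auditReport = namesToAudit.map((name) => ({ name, issues: auditName(name) }));
console.log(auditReport);
```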

Phase 2: Create Naming Conventions

Document your patterns in CLAUDE.md:

# CLAUDE.md

## Naming Conventions

### MCP Resources
Pattern: {service}-{environment}-{resource-type}
- supabase-prod-database
- supabase-staging-database
- linear-projects-api

### Files  
Pattern: {domain}-{purpose}-{type}.ts
- user-authentication-config.ts
- payment-processing-service.ts
- email-notification-handler.ts

### Functions
Pattern: {action}{Object}({parameters})
- getUserById(userId)
- validateUserInput(input)  
- sendWelcomeEmail(user)

### Environment Variables
Pattern: {SERVICE}_{ENVIRONMENT}_{PROPERTY}
- SUPABASE_PROD_HOST
- STRIPE_API_SECRET_KEY
- JWT_AUTH_TOKEN

Phase 3: Refactor Systematically

Priority order (highest impact first):

  1. MCP Resources (highest impact)

    # Before
    - db-1
    - api-conn
    
    # After  
    - supabase-prod-database
    - stripe-payment-api
    
  2. File Names (medium impact)

    mv utils.ts user-validation-utils.ts
    mv handler.ts webhook-payment-handler.ts
    
  3. Function Names (medium impact)

    // Refactor with IDE
    function get(id) → function getUserById(userId)
    function process(data) → function validateUserInput(data)
    
  4. Variable Names (lower impact, but cumulative)

    const db → const prodDatabase
    const result → const sanitizedUser
    

Phase 4: Enforce with Linting

Create custom ESLint rules:

// eslint-rules/enforce-semantic-naming.ts
import type { Rule } from 'eslint';

export const enforceSemanticNaming: Rule.RuleModule = {
  meta: {
    type: 'suggestion',
    messages: {
      genericName: 'Avoid generic names like "{{name}}". Use semantic names like "{{example}}".',
    },
  },
  create(context) {
    // Identifiers that carry no domain meaning on their own
    const GENERIC_PATTERNS = /^(utils|helper|handler|data|process|get|set)$/i;
    // Example replacements shown in the lint message
    const SUGGESTIONS: Record<string, string> = {
      utils: 'user-validation-utils',
      helper: 'email-sending-helper',
      handler: 'webhook-payment-handler',
      data: 'user-profile-data',
    };

    return {
      Identifier(node) {
        const name = node.name;
        if (GENERIC_PATTERNS.test(name)) {
          context.report({
            node,
            messageId: 'genericName',
            data: {
              name,
              example: SUGGESTIONS[name.toLowerCase()] || 'descriptive-name',
            },
          });
        }
      },
    };
  },
};

Configuration:

{
  "rules": {
    "@custom/enforce-semantic-naming": "warn"
  }
}

Phase 5: Document in CLAUDE.md

Add to root CLAUDE.md:

## Semantic Naming Conventions

All resources follow semantic naming for LLM discoverability:

### MCP Servers
- Format: {service}-{environment}-{type}
- Examples: supabase-prod-database, linear-projects-api

### Files  
- Format: {domain}-{purpose}-{type}.ext
- Examples: user-authentication-config.ts

### Functions
- Format: {action}{Object}(params)
- Examples: getUserById(), validateUserInput()

### Why: 
Semantic names enable LLM natural language search:
- "production database" → finds supabase-prod-database
- "validate user" → finds validateUserInput()

Real-World Examples

Example 1: Database Connection

Scenario: User asks “connect to the staging database”

Before (generic naming):

mcp-servers:
  - db-1
  - db-2

LLM behavior:

  1. Searches for “staging database”
  2. Finds db-1 and db-2
  3. Cannot determine which is staging
  4. Asks user: “Which database is staging: db-1 or db-2?”

After (semantic naming):

mcp-servers:
  - supabase-staging-database
  - supabase-prod-database

LLM behavior:

  1. Searches for “staging database”
  2. Finds supabase-staging-database
  3. Connects immediately without asking

Result: Task completed in 1 step instead of 2-3.
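The two behaviours can be reproduced with a toy matcher. This is a sketch of the idea, not how an LLM actually selects a resource: it scores each name by how many query words it contains, and gives up when the top score is tied. The abbreviation table and all function names are assumptions for illustration.

```typescript
// Abbreviations expanded before matching. This table is an illustrative assumption.
const ABBREVIATION_EXPANSIONS = new Map<string, string>([
  ["prod", "production"],
  ["db", "database"],
]);

function tokenizeResourceName(name: string): string[] {
  return name
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0)
    .map((token) => ABBREVIATION_EXPANSIONS.get(token) ?? token);
}

// Counts how many query words appear among the resource name's tokens.
function scoreResourceForQuery(resourceName: string, query: string): number {
  const resourceTokens = tokenizeResourceName(resourceName);
  return tokenizeResourceName(query).filter(
    (word) => resourceTokens.indexOf(word) !== -1,
  ).length;
}

// Returns the single best match, or null when nothing matches
// or the top score is tied (the caller would have to ask the user).
function findBestResourceMatch(resourceNames: string[], query: string): string | null {
  const scored = resourceNames
    .map((name) => ({ name, score: scoreResourceForQuery(name, query) }))
    .sort((a, b) => b.score - a.score);
  if (scored.length === 0 || scored[0].score === 0) return null;
  if (scored.length > 1 && scored[1].score === scored[0].score) return null;
  return scored[0].name;
}

const genericMatch = findBestResourceMatch(["db-1", "db-2"], "staging database");
const semanticMatch = findBestResourceMatch(
  ["supabase-staging-database", "supabase-prod-database"],
  "staging database",
);
console.log(genericMatch, semanticMatch);
```

With `db-1` and `db-2` both names score the same, so the matcher returns null and a clarification is needed. With the semantic names, only one contains "staging".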

Example 2: Function Discovery

Scenario: User asks “validate the user input before saving”

Before:

function check(x) { /* ... */ }
function process(y) { /* ... */ }  
function validate(z) { /* ... */ }

LLM behavior:

  1. Searches for “validate user input”
  2. Finds generic validate() function
  3. Reads function body to understand purpose
  4. Unsure if this validates user input or something else

After:

function validateUserInput(input: unknown) { /* ... */ }
function validatePaymentAmount(amount: number) { /* ... */ }
function validateEmailFormat(email: string) { /* ... */ }

LLM behavior:

  1. Searches for “validate user input”
  2. Finds validateUserInput()
  3. Confident match – uses it immediately

Result: Correct function selected without reading implementation.

Example 3: Environment Variables

Scenario: User asks “what’s the API rate limit for production?”

Before:

LIMIT=100
MAX=50  
THRESHOLD=1000

LLM behavior:

  1. Searches for “rate limit”
  2. Finds LIMIT, MAX, THRESHOLD
  3. Cannot determine which is the API rate limit
  4. Must read code or ask user

After:

API_RATE_LIMIT_PER_SECOND=100
API_MAX_CONCURRENT_REQUESTS=50
ALERT_ERROR_THRESHOLD=1000

LLM behavior:

  1. Searches for “API rate limit”
  2. Finds API_RATE_LIMIT_PER_SECOND
  3. Returns answer immediately

Result: Instant answer without context switching.

Benefits

1. Higher Retrieval Success Rate

Measured improvement:

  • Before: 50% success rate with generic names
  • After: 90%+ success rate with semantic names
  • Impact: 40-percentage-point increase in successful retrievals

2. Faster Task Completion

Time savings:

  • No clarification questions: Saves 30-60 seconds per task
  • No documentation reading: Saves 1-3 minutes per lookup
  • Confident selection: Reduces errors and retries

Cumulative: 30-40% faster task completion.

3. Reduced Errors

Semantic names prevent misidentification:

Example:

# Generic (dangerous)
db-1  # Is this prod or staging?
db-2  # Which one should I use?

# Semantic (safe)  
supabase-prod-database  # Clear: production
supabase-staging-database  # Clear: staging

Error reduction:

  • 60% fewer “wrong resource” errors
  • 80% fewer “wrong environment” errors

4. Better Onboarding

New developers understand resources by name alone:

# Self-explanatory
supabase-prod-database
stripe-payment-api  
linear-projects-api
redis-session-cache

No need to ask “What is db-1?” or read docs.

5. Self-Documenting Codebase

Code becomes readable without comments:

// Before: needs comments
const db = connect(); // connects to prod DB
const result = process(data); // validates user input  

// After: self-documenting
const prodDatabase = connectSupabaseDatabase("production");
const validatedUserData = validateAndSanitizeUserInput(data);

Best Practices

1. Use Full Words, Not Abbreviations

Bad:

const usr = getUsr(id);
const pwd = validatePwd(input);
const cfg = loadCfg();

Good:

const user = getUserById(id);
const password = validatePassword(input);  
const config = loadApplicationConfig();

Why: Abbreviations are ambiguous. Is usr “user” or “username”? Full words are unambiguous.

2. Include Environment in Resource Names

Bad:

supabase-database  # Which environment?

Good:

supabase-prod-database
supabase-staging-database

Why: Prevents accidental production access during development.

3. Use Action Verbs for Functions

Bad:

function user(id) { }       // Get? Create? Update?
function payment(data) { }  // Process? Validate? Refund?

Good:

function getUserById(id) { }
function processPayment(data) { }
function refundPayment(paymentId) { }

Why: Verbs communicate intent clearly.

4. Match Natural Language Queries

Think about how users will ask for resources:

User query: “Connect to the production database”

Mismatch: db-prod-1 (user won’t say “db prod 1”)

Match: supabase-prod-database (matches “production database”)

5. Consistent Patterns Across Codebase

Pick a pattern and apply it everywhere:

# Consistent pattern: {service}-{env}-{type}
supabase-prod-database
supabase-staging-database
stripe-prod-api
stripe-staging-api  
linear-prod-api
linear-staging-api

Why: Predictability improves discoverability.

6. Avoid Numbers in Names

Bad:

db-1, db-2, api-1, api-2

Good:

supabase-prod-database
supabase-staging-database

Why: Numbers provide no semantic meaning.

Common Pitfalls

Pitfall 1: Over-Abbreviation

Problem:

const usrAuthCfg = loadCfg();
const pwdValUtil = validate();

Solution:

const userAuthenticationConfig = loadConfig();
const passwordValidationUtil = validatePassword();

Lesson: Save keystrokes elsewhere, not in names.

Pitfall 2: Ambiguous Generics

Problem:

function handle(request) { }  // Handle how?
function process(data) { }     // Process what?

Solution:

function handleWebhookRequest(request) { }
function processPaymentData(data) { }

Lesson: Add context to generic verbs.

Pitfall 3: Mixing Patterns

Problem:

# Inconsistent patterns
supabase_prod  # underscore
stripeProdApi  # camelCase  
linear-prod    # kebab-case

Solution:

# Consistent kebab-case
supabase-prod-database
stripe-prod-api
linear-prod-api

Lesson: Pick one pattern per kind of name (kebab-case recommended for resource names) and stick to it.
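When cleaning up mixed patterns, a small normalizer can do the mechanical part of the conversion. The helper below, `toKebabCase`, is a hypothetical sketch that handles underscores, spaces, and camelCase boundaries; it does not handle runs of capitals such as `APIKey`.

```typescript
// Convert snake_case, camelCase, or space-separated names to kebab-case.
function toKebabCase(rawName: string): string {
  return rawName
    .replace(/([a-z0-9])([A-Z])/g, "$1-$2") // split camelCase boundaries
    .replace(/[_\s]+/g, "-")                // underscores and spaces become hyphens
    .toLowerCase();
}

const normalizedNames = ["supabase_prod", "stripeProdApi", "linear-prod"].map(
  (rawName) => toKebabCase(rawName),
);
```

This only fixes the casing. Missing segments, such as the `database` or `api` suffix, still have to be added by hand.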

Pitfall 4: Not Including Environment

Problem:

supabase-database  # Is this prod or staging?

Solution:

supabase-prod-database
supabase-staging-database

Lesson: Always include environment to prevent errors.

Integration with Other Patterns

Combine with Hierarchical CLAUDE.md

Document naming conventions in CLAUDE.md:

# CLAUDE.md

## Naming Conventions

Semantic naming for LLM discoverability:

### MCP Resources: {service}-{env}-{type}
- supabase-prod-database
- linear-prod-api

### Files: {domain}-{purpose}-{type}.ts  
- user-authentication-config.ts
- payment-processing-service.ts

Benefit: LLM learns conventions from context.

Combine with Custom ESLint Rules

Enforce semantic naming automatically:

// Custom ESLint rule (from your own plugin) that enforces the conventions
"@custom/enforce-semantic-naming": "error"

Benefit: Prevents regression to generic names.
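The article does not show how such a rule is implemented, so the following is only a sketch of the kind of check it might run on each identifier. `findNamingProblems`, `GENERIC_WORDS`, and `KNOWN_ABBREVIATIONS` are hypothetical names, and the word lists are illustrative rather than complete.

```typescript
// Hypothetical deny lists; a real rule would make these configurable.
const GENERIC_WORDS = new Set(["handle", "process", "data", "utils", "handler", "result"]);
const KNOWN_ABBREVIATIONS = new Set(["usr", "pwd", "cfg", "db", "val"]);

// Split a camelCase identifier into lowercase words.
function splitIdentifier(identifier: string): string[] {
  return identifier
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

// Return human-readable problems; an empty array means the name passes.
function findNamingProblems(identifier: string): string[] {
  const problems: string[] = [];
  const words = splitIdentifier(identifier);

  // Flag names made up entirely of generic words, e.g. "process" or "processData".
  if (words.length > 0 && words.every((word) => GENERIC_WORDS.has(word))) {
    problems.push(`"${identifier}" is generic; add domain context`);
  }

  // Flag each abbreviated segment, e.g. "usr" and "cfg" in "usrAuthCfg".
  for (const word of words) {
    if (KNOWN_ABBREVIATIONS.has(word)) {
      problems.push(`"${word}" is an abbreviation; spell it out`);
    }
  }
  return problems;
}
```

With these lists, `process` and `usrAuthCfg` are reported while `processPaymentData` and `handleWebhookRequest` pass, which mirrors the pitfalls described earlier.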

Combine with Knowledge Graph Retrieval

Semantic names become nodes in a knowledge graph:

supabase-prod-database
  ├─ environment: production  
  ├─ service: supabase
  └─ type: database

Benefit: Structured metadata for advanced queries.
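Because the name follows a fixed pattern, the metadata can be derived from the name itself. The sketch below assumes the `{service}-{env}-{type}` pattern with single-word segments; `ResourceNode`, `ENVIRONMENT_LABELS`, and `parseResourceName` are hypothetical, and the mapping from `prod` to `production` is an assumption based on the tree above.

```typescript
interface ResourceNode {
  name: string;
  service: string;
  environment: string;
  type: string;
}

// Hypothetical mapping from the segment used in names to the full environment label.
const ENVIRONMENT_LABELS = new Map<string, string>([
  ["prod", "production"],
  ["staging", "staging"],
]);

// Parse a {service}-{env}-{type} name into node metadata.
// Returns null when the name does not fit the pattern or uses an unknown environment.
function parseResourceName(resourceName: string): ResourceNode | null {
  const match = /^([a-z]+)-([a-z]+)-([a-z]+)$/.exec(resourceName);
  if (match === null) return null;
  const [, service, environmentSegment, type] = match;
  const environment = ENVIRONMENT_LABELS.get(environmentSegment);
  if (environment === undefined) return null;
  return { name: resourceName, service, environment, type };
}

const productionNode = parseResourceName("supabase-prod-database");
const unparseableNode = parseResourceName("db-1");
```

A name like `db-1` yields `null`, which is a useful signal in itself: resources that cannot be parsed are the ones that still need renaming.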

Measuring Success

Key Metrics

  1. Retrieval Success Rate

    Success Rate = (Successful Retrievals / Total Queries) × 100
    Target: >90%
    
  2. Clarification Questions

    Questions per Task = Clarifications / Completed Tasks
    Target: <0.5 questions/task
    
  3. Task Completion Time

    Avg Time = Total Time / Completed Tasks  
    Target: 30-40% reduction
    
  4. Error Rate

    Error Rate = (Wrong Resource Uses / Total Uses) × 100
    Target: <5%
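
The four formulas above translate directly into code once you have raw counts. This is a minimal sketch: the `NamingObservationCounts` shape and the way the counts are collected are assumptions, and the sample numbers are made up for illustration.

```typescript
// Raw counts collected over some observation window (hypothetical shape).
interface NamingObservationCounts {
  totalQueries: number;
  successfulRetrievals: number;
  completedTasks: number;
  clarificationQuestions: number;
  totalTaskSeconds: number;
  totalResourceUses: number;
  wrongResourceUses: number;
}

// Divide safely so an empty window yields 0 rather than NaN.
function ratio(numerator: number, denominator: number): number {
  return denominator === 0 ? 0 : numerator / denominator;
}

// Apply the four formulas: two percentages and two per-task averages.
function computeNamingMetrics(counts: NamingObservationCounts) {
  return {
    retrievalSuccessRate: ratio(counts.successfulRetrievals, counts.totalQueries) * 100,
    avgClarificationsPerTask: ratio(counts.clarificationQuestions, counts.completedTasks),
    avgTaskCompletionTime: ratio(counts.totalTaskSeconds, counts.completedTasks),
    errorRate: ratio(counts.wrongResourceUses, counts.totalResourceUses) * 100,
  };
}

// Made-up sample counts, for illustration only.
const sampleMetrics = computeNamingMetrics({
  totalQueries: 200,
  successfulRetrievals: 180,
  completedTasks: 50,
  clarificationQuestions: 20,
  totalTaskSeconds: 5500,
  totalResourceUses: 100,
  wrongResourceUses: 3,
});
```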
    

Tracking Dashboard

interface NamingMetrics {
  retrievalSuccessRate: number;  // %
  avgClarificationsPerTask: number;
  avgTaskCompletionTime: number; // seconds
  errorRate: number;             // %
}

const beforeSemanticNaming: NamingMetrics = {
  retrievalSuccessRate: 50,
  avgClarificationsPerTask: 3.2,
  avgTaskCompletionTime: 180,
  errorRate: 15,
};

const afterSemanticNaming: NamingMetrics = {
  retrievalSuccessRate: 92,      // +84% relative to 50
  avgClarificationsPerTask: 0.4, // -87.5% relative to 3.2
  avgTaskCompletionTime: 110,    // -39% relative to 180
  errorRate: 3,                  // -80% relative to 15
};
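
The percentages in the comments are relative changes against the `beforeSemanticNaming` baseline, not percentage-point differences. A small helper makes that calculation explicit; `percentChange` is a hypothetical name, and the inputs are copied from the two objects above.

```typescript
// Relative change from a baseline, in percent. Negative values mean a decrease.
function percentChange(before: number, after: number): number {
  if (before === 0) {
    throw new Error("Relative change is undefined for a zero baseline");
  }
  return ((after - before) / before) * 100;
}

// Inputs copied from beforeSemanticNaming and afterSemanticNaming.
const retrievalChange = percentChange(50, 92);       // about +84
const clarificationChange = percentChange(3.2, 0.4); // about -87.5
const completionTimeChange = percentChange(180, 110); // about -38.9
const errorRateChange = percentChange(15, 3);        // about -80
```

Note that retrieval success rises by 42 percentage points (50 to 92), which is an 84% relative improvement; stating which of the two you mean avoids confusion when reporting results.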

Conclusion

Semantic naming transforms LLM effectiveness by making resources discoverable.

Key takeaways:

  1. Name resources how humans think: Match natural language queries
  2. Use consistent patterns: {service}-{environment}-{type}
  3. Avoid generic names: No more “utils” or “handler”
  4. Include context: Environment, domain, purpose
  5. Self-documenting: Code becomes readable without comments

The result, going by the example dashboard above: over 90% retrieval success, roughly 40% faster task completion, and 80% fewer wrong-resource errors.

Start today:

  1. Audit your MCP resources
  2. Refactor generic names to semantic patterns
  3. Document conventions in CLAUDE.md
  4. Enforce with linting
  5. Measure improvement

Remember: Every generic name is a missed opportunity for LLM discovery. Semantic naming is an investment that compounds: the more resources you name semantically, the more powerful your LLM interactions become.
