Chunking

Also called: text splitting

Chunking is splitting a long document into smaller pieces before you embed and retrieve them. The size and overlap of the chunks decide what can be found as a unit, so this step quietly makes or breaks a retrieval system.

James Phoenix
Understanding Data Updated July 3, 2026

You cannot retrieve a whole book. Chunking is the step where you cut a long document into smaller pieces so each one can be embedded, stored, and retrieved on its own. It sounds mechanical, but the choices you make here decide what your system can actually find.

The two dials

Every chunker has a size and an overlap. Size sets how much text lives in one retrievable unit; overlap repeats a little text between neighbours so a fact that straddles a boundary is not cut in half. A minimal splitter looks like this:

TypeScript
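function chunk(text: string, size = 500, overlap = 50): string[] {
  // Guard: the step (size - overlap) must be positive or the loop never advances.
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError('need size > 0 and 0 <= overlap < size')
  }
  const words = text.split(/\s+/).filter(Boolean)
  const out: string[] = []
  for (let i = 0; i < words.length; i += size - overlap) {
    out.push(words.slice(i, i + size).join(' '))
    // Stop once a chunk reaches the end, so no tail chunk repeats only overlap.
    if (i + size >= words.length) break
  }
  return out
}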

Production chunkers usually split on paragraph or sentence boundaries rather than raw word counts, so a chunk holds a complete thought instead of half of one.
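
As a rough sketch of that idea, the splitter below packs whole sentences into a chunk until a word budget is reached, then carries the last sentence forward as overlap. The name `chunkBySentence`, the `maxWords` and `overlapSentences` parameters, and the naive punctuation-based sentence split are all assumptions made for illustration; a production chunker would likely use a proper sentence tokenizer and count model tokens rather than words.

```typescript
// Hypothetical helper: pack whole sentences into chunks of roughly maxWords words.
function chunkBySentence(text: string, maxWords = 120, overlapSentences = 1): string[] {
  // Naive sentence split (assumption): a run of text ending in ., ! or ?
  const sentences = (text.match(/[^.!?]+[.!?]*/g) ?? [])
    .map((s) => s.trim())
    .filter(Boolean)
  const countWords = (s: string): number => s.split(/\s+/).filter(Boolean).length

  const chunks: string[] = []
  let current: string[] = []
  let currentWords = 0

  for (const sentence of sentences) {
    const n = countWords(sentence)
    // Close the current chunk if this sentence would push it over the budget.
    if (current.length > 0 && currentWords + n > maxWords) {
      chunks.push(current.join(' '))
      // Carry the trailing sentences forward as overlap, but never the whole chunk.
      const keep = Math.min(overlapSentences, current.length - 1)
      current = keep > 0 ? current.slice(-keep) : []
      currentWords = current.reduce((sum, s) => sum + countWords(s), 0)
    }
    current.push(sentence)
    currentWords += n
  }
  if (current.length > 0) chunks.push(current.join(' '))
  return chunks
}
```

A single sentence longer than `maxWords` still becomes its own chunk, and carried-over overlap can push a chunk past the budget, so `maxWords` is a target here and not a hard cap.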

Getting the size right

  • Too small and a chunk loses the context that makes it meaningful; you retrieve a sentence with no idea what it refers to.
  • Too large and each chunk covers several topics, so retrieval gets vague and you waste tokens pulling in irrelevant text.
  • Overlap buys safety at the cost of some duplication. A little overlap stops facts from being sliced across a boundary.
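
To put a number on that duplication, here is a back-of-the-envelope sketch for a fixed-size word splitter. It assumes a new chunk starts every `size - overlap` words and that splitting stops once a chunk reaches the end of the text; the helper name `overlapCost` and its return shape are hypothetical.

```typescript
// Hypothetical helper: estimate chunk count and duplicated words for a
// fixed-size word splitter that starts a new chunk every (size - overlap) words.
function overlapCost(
  totalWords: number,
  size = 500,
  overlap = 50
): { chunks: number; duplicatedWords: number; overhead: number } {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new RangeError('need size > 0 and 0 <= overlap < size')
  }
  // Short texts fit in a single chunk, so nothing is repeated.
  if (totalWords <= size) {
    return { chunks: totalWords > 0 ? 1 : 0, duplicatedWords: 0, overhead: 0 }
  }
  const step = size - overlap
  // The first chunk covers `size` words; each later chunk adds up to `step` new words.
  const chunks = 1 + Math.ceil((totalWords - size) / step)
  // Every chunk after the first repeats `overlap` words from its predecessor.
  const duplicatedWords = (chunks - 1) * overlap
  return { chunks, duplicatedWords, overhead: duplicatedWords / totalWords }
}

console.log(overlapCost(10000)) // 23 chunks, 1100 duplicated words, overhead 0.11
```

Under these assumptions a 10,000-word document with the default settings stores about 11% more text than it contains, which is usually a fair price for not slicing facts in half.
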

Tip
Chunking still matters even with huge context windows, because feeding a model only the relevant pieces reduces lost-in-the-middle effects and cuts token cost. Retrieve the chunk, not the whole file.
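
A toy illustration of retrieving the chunk instead of the whole file: score every chunk against the query and pass only the best few to the model. The keyword-overlap scorer is a deliberately simple stand-in chosen for this sketch; a real system would compare embedding vectors. The names `keywordScore` and `topChunks` are hypothetical.

```typescript
// Lowercase a string and split it into alphanumeric word tokens.
const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) ?? []

// Toy scorer (assumption): how many distinct query words appear in the chunk.
function keywordScore(query: string, chunkText: string): number {
  const chunkWords = new Set(tokenize(chunkText))
  const queryWords = new Set(tokenize(query))
  let hits = 0
  queryWords.forEach((w) => {
    if (chunkWords.has(w)) hits += 1
  })
  return hits
}

// Keep the k best-scoring chunks and drop any chunk with no match at all.
function topChunks(query: string, chunks: string[], k = 2): string[] {
  return chunks
    .map((text) => ({ text, score: keywordScore(query, text) }))
    .filter((c) => c.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((c) => c.text)
}
```

Only the chunks returned by `topChunks` would go into the prompt, so the token cost scales with `k` and the chunk size, not with the length of the source document.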
