The craft of context, defined.
12 plain-English definitions for the techniques that decide what a model sees: retrieval, agent patterns, reliability, evaluation, and the ways long-context systems fail. Every applicable entry comes with a runnable, tested Vercel AI SDK example.
Foundations
What context engineering is, and the raw material it works with: the window, the tokens, the prompt.
Retrieval & RAG
Pulling the right information into the window at the right time, instead of hoping the model already knows it.
Chunking
Chunking is splitting a long document into smaller pieces before you embed and retrieve them. The size and overlap of the chunks decide what can be found as a unit, so it quietly makes or breaks a retrieval system.
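As a rough illustration of how size and overlap interact, here is a minimal sketch of fixed-size character chunking. The function name `chunkText` and the parameter values are assumptions made for this example; real systems often split on tokens, sentences, or headings instead.

```typescript
// Split text into fixed-size character chunks, each sharing `overlap`
// characters with the previous one. Values used below are illustrative only.
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this chunk reached the end
  }
  return chunks;
}

const chunks = chunkText("abcdefghij", 4, 2);
console.log(chunks); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

With an overlap of 2, a phrase that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.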
Embeddings
An embedding turns a piece of text into a list of numbers that captures its meaning, so that similar ideas land near each other. Embeddings are what let you search by meaning instead of by exact keyword.
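"Near each other" is usually measured with cosine similarity. The sketch below uses tiny hand-written vectors as stand-ins for real embeddings, which typically have hundreds or thousands of dimensions and come from an embedding model; the numbers here are invented purely for illustration.

```typescript
// Cosine similarity: 1 means same direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up 3-dimensional "embeddings", for illustration only.
const cat = [0.9, 0.1, 0.0];
const kitten = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.1, 0.9];

const simCatKitten = cosineSimilarity(cat, kitten);
const simCatInvoice = cosineSimilarity(cat, invoice);
console.log({ simCatKitten, simCatInvoice });
```

The related pair scores close to 1 and the unrelated pair close to 0, which is the property a semantic search ranks on.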
Retrieval-augmented generation (RAG)
RAG is the workhorse pattern of context engineering: retrieve the material relevant to a request, put it in the context, and let the model generate an answer grounded in it rather than guessing from memory.
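The retrieve-then-ground steps can be sketched without any model at all. In this minimal example the scorer is a simple word-overlap stand-in (a real system would more likely compare embeddings), and `Doc`, `retrieve`, and `buildPrompt` are hypothetical names chosen for the illustration. The resulting prompt is what you would then send to a model.

```typescript
type Doc = { id: string; text: string };

// Stand-in relevance score: how many words of the doc appear in the query.
function score(query: string, doc: Doc): number {
  const words = new Set(query.toLowerCase().split(/\W+/).filter(Boolean));
  return doc.text.toLowerCase().split(/\W+/).filter((w) => words.has(w)).length;
}

// Retrieve: keep the k highest-scoring documents.
function retrieve(query: string, docs: Doc[], k: number): Doc[] {
  return [...docs].sort((a, b) => score(query, b) - score(query, a)).slice(0, k);
}

// Augment: place the retrieved text in the prompt, ahead of the question.
function buildPrompt(query: string, context: Doc[]): string {
  const sources = context.map((d) => `[${d.id}] ${d.text}`).join("\n");
  return `Answer using only the sources below.\n\nSources:\n${sources}\n\nQuestion: ${query}`;
}

const docs: Doc[] = [
  { id: "refunds", text: "Refunds are issued within 14 days of a return." },
  { id: "shipping", text: "Standard shipping takes 3 to 5 business days." },
];
const question = "How long do refunds take?";
const ragPrompt = buildPrompt(question, retrieve(question, docs, 1));
console.log(ragPrompt);
```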
Agent patterns
The shapes an LLM system can take, from fixed workflows to autonomous agents that choose their own path.
Agents vs. workflows
A workflow follows a path you designed in advance; an agent decides its own path at run time by calling tools in a loop toward a goal. Knowing which one you actually need is the first context-engineering decision.
Prompt chaining
Prompt chaining breaks a task into a fixed sequence of steps, feeding each step’s output into the next. It is the simplest workflow pattern, and it beats one giant prompt whenever a task has natural stages.
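A chain is little more than a loop that threads one step's output into the next step's prompt. The sketch below assumes a hypothetical `Model` function type and uses an echo function in place of a real model call, so it shows the control flow rather than any particular SDK.

```typescript
// Hypothetical model interface: takes a prompt, resolves to text.
type Model = (prompt: string) => Promise<string>;

// Each step is a template that turns the previous output into the next prompt.
type Step = (previous: string) => string;

async function runChain(model: Model, input: string, steps: Step[]): Promise<string> {
  let current = input;
  for (const step of steps) {
    current = await model(step(current)); // output of one step feeds the next
  }
  return current;
}

// Stand-in model that just wraps the prompt, so the data flow is visible.
const echoModel: Model = async (p) => `<${p}>`;

const chainResult = runChain(echoModel, "tea", [
  (topic) => `Outline: ${topic}`,
  (outline) => `Draft from: ${outline}`,
]);
chainResult.then((out) => console.log(out)); // <Draft from: <Outline: tea>>
```

Because the sequence is fixed, each step can be inspected, tested, or swapped on its own.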
Routing
Routing classifies an input and sends it to the handler built for it. It keeps each path specialised and lets you send easy cases to a cheap model and hard cases to an expensive one, without any of the cost of a full agent.
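In code, routing is a classifier followed by a lookup. This sketch uses a keyword classifier as a stand-in (in practice the classifier is often a small, cheap model call), and the route labels and model names are hypothetical placeholders.

```typescript
type Route = "billing" | "technical" | "general";

// Stand-in classifier based on keywords.
function classify(input: string): Route {
  const text = input.toLowerCase();
  if (/\b(invoice|refund|charge)\b/.test(text)) return "billing";
  if (/\b(error|crash|bug)\b/.test(text)) return "technical";
  return "general";
}

// One specialised handler per route; model names are placeholders.
const handlers: Record<Route, { model: string; system: string }> = {
  billing: { model: "small-model", system: "You handle billing questions." },
  technical: { model: "large-model", system: "You debug technical issues step by step." },
  general: { model: "small-model", system: "You answer general questions briefly." },
};

function route(input: string) {
  const label = classify(input);
  return { label, ...handlers[label] };
}

console.log(route("The app shows an error on login"));
```

Only the technical route pays for the larger model here; everything else stays on the cheaper path.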
Tool use
Tool use lets a model do more than produce text: you expose named actions with typed inputs, and the model calls them to read data, run code, or reach the outside world. It is the bridge from talking to doing.
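Stripped of any SDK, a tool is a name, a description, an input validator, and a function. The sketch below assumes a hypothetical `getWeather` tool that returns canned data and hard-codes the call a model would normally emit; schema libraries usually replace the hand-written `parse` step.

```typescript
type ToolCall = { name: string; args: unknown };

type Tool<A> = {
  description: string;
  parse: (args: unknown) => A; // validate raw model output into typed input
  execute: (args: A) => string;
};

// Hypothetical tool returning canned data, for illustration only.
const getWeather: Tool<{ city: string }> = {
  description: "Look up the weather for a city.",
  parse: (args) => {
    const candidate = args as { city?: unknown } | null;
    if (typeof candidate === "object" && candidate !== null && typeof candidate.city === "string") {
      return { city: candidate.city };
    }
    throw new Error("getWeather expects { city: string }");
  },
  execute: ({ city }) => `Sunny in ${city}`,
};

const tools: Record<string, Tool<any>> = { getWeather };

// Dispatch a model-requested call: look up the tool, validate, then run it.
function runToolCall(call: ToolCall): string {
  const tool = tools[call.name];
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);
  return tool.execute(tool.parse(call.args));
}

const toolResult = runToolCall({ name: "getWeather", args: { city: "Oslo" } });
console.log(toolResult); // Sunny in Oslo
```

Validating before executing matters because the arguments come from model output, which is not guaranteed to match the declared types.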
Reliability techniques
Getting consistent, trustworthy output from a stochastic model that may not give the same answer twice.
Evaluation
Measuring whether your system is actually any good, so you can improve it on purpose rather than by vibes.
Failure modes
The predictable ways long-context systems break, so you can see them coming.
Memory
Carrying the right state across turns and sessions without drowning the window in history.