Training is where a model comes from. You take an untrained network, feed it a vast corpus of text and code, and repeatedly nudge its parameters so its next-token predictions get less wrong. Do that at enormous scale and the model absorbs grammar, facts, coding patterns, and a surprising amount of reasoning ability. The output is the finished set of weights.
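To make "nudge its parameters so its next-token predictions get less wrong" concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not how any real model is trained: the corpus, the one-row-per-previous-character model, and the names `weights`, `train`, and `average_loss` are all hypothetical and chosen only for illustration.

```python
import math

# Toy corpus and vocabulary (hypothetical, for illustration only).
corpus = "abababababab"
vocab = sorted(set(corpus))
index = {ch: i for i, ch in enumerate(vocab)}
V = len(vocab)

# Parameters: one row of scores per previous token.
# All zeros means "untrained": every next token looks equally likely.
weights = [[0.0] * V for _ in range(V)]

# Training examples: (previous token, actual next token) pairs.
pairs = [(index[a], index[b]) for a, b in zip(corpus, corpus[1:])]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_loss():
    """Average cross-entropy: how wrong the next-token predictions are."""
    return sum(-math.log(softmax(weights[p])[n]) for p, n in pairs) / len(pairs)


def train(steps=200, learning_rate=0.5):
    """Repeatedly nudge the parameters to reduce the prediction error."""
    for _ in range(steps):
        for prev, nxt in pairs:
            probs = softmax(weights[prev])
            for j in range(V):
                # Gradient of cross-entropy with respect to each score.
                grad = probs[j] - (1.0 if j == nxt else 0.0)
                weights[prev][j] -= learning_rate * grad


loss_before = average_loss()
train()
loss_after = average_loss()
print(f"loss before: {loss_before:.3f}, loss after: {loss_after:.3f}")
```

The loss falls as the loop runs, and once `train()` returns, `weights` is the finished artifact. Real training follows the same shape of loop, only with vastly more parameters and data.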
A one-time, frozen event
The two things worth remembering about training:
- It is expensive and rare. Training a frontier model costs a fortune in compute and happens on a schedule set by the provider, not by you. You consume the result via inference.
- It freezes knowledge in time. Whatever the model learned is fixed at the moment training ended. It has no awareness of anything that happened afterwards, which is the root of the knowledge-cutoff problem and a common source of confident-but-outdated answers.
Training vs. giving context
This is the single most useful implication for daily work: you do not "train" a coding agent by talking to it. Correcting it in a conversation changes nothing about its parameters. What you are actually doing is adjusting its context, the text in front of it right now. If you want it to know your codebase or your conventions, you supply that as context on each request, because the training door is closed.
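The difference between training and context can be sketched in a few lines of Python. Everything here is hypothetical: `FROZEN_WEIGHTS` stands in for whatever training produced, and `infer` is a stand-in for an inference call, not any real provider's API. The point is only that a request can read the weights but never write to them.

```python
from types import MappingProxyType

# Hypothetical stand-in for trained parameters: read-only after training.
FROZEN_WEIGHTS = MappingProxyType({"indent": "4 spaces"})


def infer(context: str, question: str) -> str:
    """Hypothetical inference call: reads context and weights, changes neither."""
    # Anything stated in the context for this request takes priority.
    for line in context.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == question:
            return value.strip()
    # Otherwise fall back on what was fixed at training time.
    return FROZEN_WEIGHTS.get(question, "unknown")


# Request 1: the convention is supplied as context, so the answer follows it.
with_context = infer("indent: 2 spaces", "indent")

# Request 2: the context is gone, and so is the "correction".
without_context = infer("", "indent")

print(with_context)
print(without_context)
```

The second call falls back to the frozen default because nothing from the first call was stored anywhere. That is why conventions have to be supplied again on each request.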
Related terms
Model
A model is the trained artifact at the centre of every AI coding tool: a large file of numbers (parameters) that, given some text, predicts a likely continuation. When people ask "which model are you using," this is the thing they mean.
Inference
Inference is the act of running a trained model to get an answer: text goes in, a prediction comes out. Every message you send to a coding agent triggers inference. It is the opposite end of the lifecycle from training.
Hallucination
A hallucination is a confident, plausible-sounding output that is simply wrong: an invented API, a fabricated file path, a made-up citation. It is not the model lying. It is the model doing exactly what it always does, predicting plausible text, with no built-in sense of truth.