
Conversation history

Also called: chat history, message history

Conversation history is the running list of past turns you re-send on every request so the model appears to remember. It is the simplest form of memory, and the first thing to overflow a context window if you never prune it.

James Phoenix
Understanding Data Updated July 3, 2026

A model keeps no memory between calls, so a chat that "remembers" is really you re-sending every prior turn each time. Conversation history is that growing list of messages. Append the model's reply and the next user turn, send the whole thing back, and the model can resolve "it" and "that" to what came before.

It is just a growing array

Each turn adds to the list, and the full list goes on the next request:

TypeScript
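import { generateText, type ModelMessage } from 'ai'
import { openai } from '@ai-sdk/openai'

const model = openai('gpt-5-mini')

// Annotate the array so each `role` keeps its literal type instead of widening to `string`.
// (`ModelMessage` is the AI SDK 5 name; earlier versions export `CoreMessage`.)
const messages: ModelMessage[] = [{ role: 'user', content: 'My name is James. Remember it.' }]

const first = await generateText({ model, messages })

// Append the reply and the next user turn; the history now holds three messages.
messages.push({ role: 'assistant', content: first.text })
messages.push({ role: 'user', content: 'What is my name?' })

// The full list goes out again, so the model sees the earlier turn.
const second = await generateText({ model, messages })
// second.text -> e.g. "You mentioned that your name is James..."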

Because the earlier turn is still in the messages array, the model can answer the follow-up.
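
The same pattern can be wrapped in a small helper that owns the array. What follows is a minimal sketch, not part of the AI SDK: `createChat`, `Reply`, and the local `Message` type are hypothetical names, and `echoCount` is a stand-in for a real model call so the example runs offline.

```typescript
// Minimal local message shape (an assumption; the AI SDK's own type is richer).
type Message = { role: 'system' | 'user' | 'assistant'; content: string }

// Hypothetical signature: anything that maps the full history to a reply.
type Reply = (messages: Message[]) => Promise<string>

// Hypothetical helper: owns the history array and re-sends it on every turn.
function createChat(reply: Reply) {
  const messages: Message[] = []
  return {
    messages,
    async send(userText: string): Promise<string> {
      messages.push({ role: 'user', content: userText })
      const text = await reply(messages) // the whole list goes out each time
      messages.push({ role: 'assistant', content: text })
      return text
    },
  }
}

// Stand-in for a real model call: reports how many messages it was sent.
const echoCount: Reply = async (history) => `I have seen ${history.length} message(s).`
```

To talk to a real model, a `reply` along the lines of `async (messages) => (await generateText({ model, messages })).text` should slot in, assuming the message shapes line up. Keeping the model call behind a callback also makes the history logic easy to test without network access.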

The catch

Conversation history works beautifully for short chats and breaks down for long ones. Fifty turns in, you are spending thousands of tokens re-sending old context before the model even reads the new question, and the useful part gets buried (see lost in the middle). That is where smarter memory strategies come in:

  • Summarise the old turns into a compact recap instead of shipping them verbatim.
  • Retrieve only the relevant past turns with embeddings, rather than all of them.
  • Trim anything the current task does not need.
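
The first two strategies can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: `compactHistory`, `retrieveRelevant`, `StoredTurn`, and the local `Message` type are hypothetical names, the `summarise` callback stands in for whatever produces the recap (for example a cheap model call), and embeddings are assumed to be precomputed vectors of equal length.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string }

// Summarise: fold everything except the last `keepRecent` messages into one recap.
// `summarise` is a hypothetical callback supplied by the caller.
async function compactHistory(
  messages: Message[],
  keepRecent: number,
  summarise: (old: Message[]) => Promise<string>,
): Promise<Message[]> {
  if (messages.length <= keepRecent) return messages
  const cut = messages.length - keepRecent
  const recap = await summarise(messages.slice(0, cut))
  return [
    { role: 'system', content: `Summary of earlier conversation: ${recap}` },
    ...messages.slice(cut),
  ]
}

// A past turn stored alongside its (assumed precomputed) embedding vector.
type StoredTurn = { message: Message; embedding: number[] }

// Cosine similarity; assumes both vectors have the same length.
function cosine(a: number[], b: number[]): number {
  let dot = 0
  let normA = 0
  let normB = 0
  a.forEach((x, i) => {
    const y = b[i] ?? 0
    dot += x * y
    normA += x * x
    normB += y * y
  })
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB)
}

// Retrieve: the k stored turns most similar to the query, returned in original order.
function retrieveRelevant(turns: StoredTurn[], query: number[], k: number): Message[] {
  return turns
    .map((turn, index) => ({ turn, index, score: cosine(turn.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .sort((x, y) => x.index - y.index)
    .map((x) => x.turn.message)
}
```

Restoring chronological order after ranking keeps the retrieved turns readable as a conversation. Trimming needs no helper of its own: it amounts to dropping messages without writing a recap.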

Tip
Full history is the right default until it is not. Watch the token count as a session grows, and switch to summarising or retrieving memory before the window fills and requests start failing or older context gets silently truncated.
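
One way to watch the token count is a cheap estimate checked before each request. The sketch below rests on assumptions: `estimateTokens` and `shouldCompact` are hypothetical helpers, the four-characters-per-token ratio is a rough heuristic for English text rather than a real tokenizer, and the 70% threshold is an arbitrary example value.

```typescript
type Message = { role: 'system' | 'user' | 'assistant'; content: string }

// Rough heuristic, NOT a real tokenizer: about 4 characters per token.
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0)
  return Math.ceil(chars / 4)
}

// Hypothetical policy: compact once the history reaches `threshold` of the window.
function shouldCompact(
  messages: Message[],
  contextWindow: number,
  threshold = 0.7,
): boolean {
  return estimateTokens(messages) >= contextWindow * threshold
}
```

Where the provider reports token usage with each response, that figure is a more accurate signal than a character-based estimate.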
