Input tokens

Every request to a model provider has two sides: what you send in and what comes back. Input tokens are the tokens you send in: the system prompt, the whole conversation so far, any files the agent has read, the tool definitions, and your latest message. They are also called prompt tokens. The provider adds them all up, and that total is what the model reads before it writes a single word back.

What counts as input

It is easy to underestimate how large the input is, because most of it is not typed by you. On a real coding request the input is dominated by:

The system prompt and every tool definition, sent on every request.
The full history of the session, which only grows.
File contents and command output the agent has pulled in.

Your actual instruction is often the smallest part.
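To make the proportions concrete, here is a minimal sketch that estimates how a single request's input splits across those parts. It assumes a rough rule of thumb of about four characters per token, which is not a real tokenizer, and the part names and sizes in `request_parts` are hypothetical placeholders chosen only for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumption, not a real tokenizer)."""
    return (len(text) + 3) // 4


def input_breakdown(parts: dict[str, str]) -> dict[str, int]:
    """Return the estimated token count for each named part of a request."""
    return {name: estimate_tokens(text) for name, text in parts.items()}


# Hypothetical request contents; the sizes are made up for illustration.
request_parts = {
    "system_prompt_and_tools": "x" * 12_000,
    "session_history": "x" * 40_000,
    "files_and_command_output": "x" * 24_000,
    "your_instruction": "Rename the helper and update its callers.",
}

breakdown = input_breakdown(request_parts)
total = sum(breakdown.values())

# Print each part with its estimated tokens and its share of the total input.
for name, tokens in breakdown.items():
    print(f"{name:26} {tokens:>6} tokens  {tokens / total:6.1%}")
```

With these made-up sizes, the typed instruction is a tiny fraction of the total. For real numbers, use the token counts your provider reports rather than a character-based estimate.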

Why input tokens deserve attention

Two reasons, one about cost and one about quality:

You pay for them. Input tokens are billed, usually at a lower rate than output tokens, but there are far more of them, so they often dominate the bill on a long session.
They fill the [context window](/ai-coding-dictionary/context-window). Input is what consumes the window, and a window packed with marginally relevant input both costs more and dilutes the model's attention.
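The cost point can be checked with simple arithmetic. The sketch below uses hypothetical prices (3.00 per million input tokens, 15.00 per million output tokens) and hypothetical session totals. None of these figures come from a real price list, so substitute your provider's actual rates.

```python
def session_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> tuple[float, float]:
    """Return (input_cost, output_cost) for a session, in the same currency as the prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost, output_cost


# Hypothetical long session: far more input than output, priced at hypothetical rates.
input_cost, output_cost = session_cost(
    input_tokens=2_000_000,
    output_tokens=50_000,
    input_price_per_million=3.00,
    output_price_per_million=15.00,
)

print(f"input:  {input_cost:.2f}")
print(f"output: {output_cost:.2f}")
print(f"input share of bill: {input_cost / (input_cost + output_cost):.0%}")
```

In this illustration the output rate is five times the input rate, yet input still accounts for most of the bill because there are forty times as many input tokens.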

Tip

If a session feels expensive or sluggish, look at what you are re-sending. Because the model is stateless, the entire history rides along on every request. Clearing finished work is the most direct way to cut input tokens.
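A small simulation shows why re-sent history adds up and what clearing buys you. This is a simplified model under stated assumptions: every request carries a fixed overhead (system prompt plus tool definitions), every turn adds the same number of tokens to the history, and clearing resets the history to zero. The function name, the parameters, and all the numbers are hypothetical.

```python
from typing import Optional


def total_input_tokens(
    turns: int,
    fixed_overhead: int,
    added_per_turn: int,
    clear_every: Optional[int] = None,
) -> int:
    """Sum the input tokens sent across a session in which each request re-sends the history.

    clear_every: if set, the history is reset to zero after every N turns.
    """
    history = 0
    total = 0
    for turn in range(1, turns + 1):
        history += added_per_turn          # this turn's message, files, and output join the history
        total += fixed_overhead + history  # the whole history rides along on the request
        if clear_every and turn % clear_every == 0:
            history = 0                    # finished work is cleared
    return total


# Hypothetical 20-turn session: 3,000 tokens of overhead, 2,000 tokens added per turn.
without_clearing = total_input_tokens(turns=20, fixed_overhead=3_000, added_per_turn=2_000)
with_clearing = total_input_tokens(turns=20, fixed_overhead=3_000, added_per_turn=2_000, clear_every=5)

print(f"never cleared:         {without_clearing:,} input tokens")
print(f"cleared every 5 turns: {with_clearing:,} input tokens")
```

Under these assumptions the uncleared session's total grows roughly with the square of the number of turns, while clearing keeps each request closer to the fixed overhead. Real sessions are messier, but the shape of the growth is the same.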

Related terms

Token

Output tokens

Context window
