Input tokens

Also called: prompt tokens

Input tokens are the tokens you send in a request: the system prompt, the conversation history, loaded files, and tool definitions. You are billed for them, and they count against the context window.

James Phoenix
Understanding Data Updated July 2, 2026

Every request to a model provider has two sides: what you send and what the model sends back. Input tokens are the tokens you send in: the system prompt, the whole conversation so far, any files the agent has read, the tool definitions, and your latest message. They are also called prompt tokens. The provider adds them all up, and that total is what the model reads before it writes a single word back.

What counts as input

It is easy to underestimate this, because most input is not typed by you. On a real coding request the input is dominated by:

  • The system prompt and every tool definition, sent on every request.
  • The full history of the session, which only grows.
  • File contents and command output the agent has pulled in.

Your actual instruction is often the smallest part.
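
To make the proportions concrete, here is a minimal sketch that totals the input for one request by component. The `estimate_tokens` helper and its 4-characters-per-token rule are assumptions for illustration only; real tokenizers give different counts, and the sizes in the example are invented.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return len(text) // 4


def input_breakdown(system_prompt, tool_definitions, history, file_contents, instruction):
    """Return estimated input tokens per component, plus the total."""
    parts = {
        "system_prompt": estimate_tokens(system_prompt),
        "tool_definitions": sum(estimate_tokens(t) for t in tool_definitions),
        "history": sum(estimate_tokens(m) for m in history),
        "files_and_output": sum(estimate_tokens(f) for f in file_contents),
        "instruction": estimate_tokens(instruction),
    }
    parts["total"] = sum(parts.values())
    return parts


# Invented sizes, chosen only to show the shape of a coding request.
breakdown = input_breakdown(
    system_prompt="x" * 4000,
    tool_definitions=["t" * 2000] * 5,
    history=["m" * 1200] * 20,
    file_contents=["f" * 16000] * 2,
    instruction="Rename the helper function.",
)

for name, tokens in breakdown.items():
    print(f"{name:>18}: {tokens}")
```

With these made-up sizes, the typed instruction is a handful of tokens while everything around it runs into the thousands.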

Why input tokens deserve attention

Two reasons, one about cost and one about quality:

  • You pay for them. Input tokens are billed, usually at a lower rate than output tokens, but there are far more of them, so they often dominate the bill on a long session.
  • They fill the [context window](/ai-coding-dictionary/context-window). Input takes up most of the window in a typical session, and a window packed with marginally relevant input both costs more and dilutes the model's attention.
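
The cost point can be checked with simple arithmetic. The sketch below assumes hypothetical prices (3.00 per million input tokens, 15.00 per million output tokens) and invented token counts; actual rates depend on the provider and model.

```python
def session_cost(input_tokens, output_tokens, input_price_per_mtok, output_price_per_mtok):
    """Split a session's cost into input and output, given prices per million tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = output_tokens / 1_000_000 * output_price_per_mtok
    total = input_cost + output_cost
    return {
        "input_cost": input_cost,
        "output_cost": output_cost,
        "input_share": input_cost / total if total else 0.0,
    }


# Hypothetical long session: 2M input tokens, 50k output tokens.
cost = session_cost(
    input_tokens=2_000_000,
    output_tokens=50_000,
    input_price_per_mtok=3.0,    # assumed rate, not a real price list
    output_price_per_mtok=15.0,  # assumed rate, not a real price list
)
print(f"input share of bill: {cost['input_share']:.0%}")
```

Even with output priced five times higher per token in this example, the much larger input volume makes up most of the bill.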

Tip
If a session feels expensive or sluggish, look at what you are re-sending. Because the model is stateless, the entire history rides along on every request. Clearing finished work from the conversation is the most direct way to cut input tokens.
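
A small simulation shows why re-sent history adds up. This is a simplified model under stated assumptions: every request carries a fixed overhead (system prompt and tool definitions), each turn adds the same number of tokens to the history, and clearing resets the history to zero. The function name and all numbers are hypothetical.

```python
def cumulative_input_tokens(fixed_overhead, tokens_per_turn, turns, clear_every=None):
    """Total input tokens sent across a session in which each request re-sends the history."""
    total = 0
    history = 0
    for turn in range(1, turns + 1):
        history += tokens_per_turn         # this turn's content joins the history
        total += fixed_overhead + history  # the whole history is sent again
        if clear_every and turn % clear_every == 0:
            history = 0                    # finished work is cleared (assumed full reset)
    return total


without_clearing = cumulative_input_tokens(fixed_overhead=3000, tokens_per_turn=1000, turns=40)
with_clearing = cumulative_input_tokens(fixed_overhead=3000, tokens_per_turn=1000, turns=40, clear_every=10)

print(without_clearing, with_clearing)
```

Without clearing, the total grows roughly with the square of the number of turns, because each new request repeats everything before it. Periodic clearing keeps each stretch of the session short, so the total grows much more slowly.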
