Context window
The context window is the maximum amount of text, measured in tokens, that an LLM can process in a single interaction: in effect, its working memory for that exchange.
The context window determines what an LLM can 'see' at once. GPT-3 (2020) had 2,048 tokens; GPT-4 (2023) launched with 8k, later extended to 32k and 128k variants. By 2024-2026, windows from 200k tokens (Claude Sonnet) to 1M tokens (Gemini 1.5 Pro and, in beta, Claude Sonnet 4) had become standard. Larger windows let a model take in entire codebases or books at once, but input cost per query scales with prompt length.
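To make the token arithmetic concrete, here is a minimal sketch that counts tokens with OpenAI's tiktoken library and checks them against a budget. The 200,000-token budget and the cl100k_base encoding are illustrative assumptions; other models use other tokenizers and limits.

```python
# Minimal sketch: does this text fit in an assumed context window?
# Token counts are tokenizer-specific; cl100k_base is the GPT-4-era encoding.
import tiktoken

CONTEXT_WINDOW = 200_000  # assumed budget, in tokens

def fits_in_context(text: str, budget: int = CONTEXT_WINDOW) -> bool:
    enc = tiktoken.get_encoding("cl100k_base")
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens:,} tokens of a {budget:,}-token window")
    return n_tokens <= budget

fits_in_context("The context window is the model's working memory.")
```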
Example
A 1M-token context (e.g. Gemini 1.5 Pro or Claude Sonnet 4) lets a developer supply tens of thousands of lines of code plus the accompanying docs in one prompt and ask 'analyse where memory leaks might be'. That was impossible with the 8k windows of 2023.
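A hypothetical sketch of how such a prompt might be assembled: walk a repository, concatenate files, and stop before the token budget overflows. The 1M-token budget, the Python-only file filter, and the tokenizer choice are assumptions for illustration; real tooling also needs chunking and filtering.

```python
# Pack a codebase into one long prompt, staying under an assumed token budget.
from pathlib import Path
import tiktoken

BUDGET = 1_000_000  # assumed 1M-token context window
enc = tiktoken.get_encoding("cl100k_base")

def pack_repo(root: str, budget: int = BUDGET) -> str:
    parts, used = [], 0
    for path in sorted(Path(root).rglob("*.py")):  # Python files only, for brevity
        chunk = f"# file: {path}\n{path.read_text(errors='ignore')}\n"
        cost = len(enc.encode(chunk))
        if used + cost > budget:
            break  # stop before overflowing the window
        parts.append(chunk)
        used += cost
    return "".join(parts)

prompt = pack_repo("my_project") + "\nAnalyse where memory leaks might be."
```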
Frequently asked questions
Is bigger always better?
Not necessarily. With very long prompts, models can exhibit 'lost in the middle' behaviour, attending less reliably to information placed in the middle of the context. Input costs also rise roughly linearly with prompt length, so the right window size is a trade-off.
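The linear cost scaling is easy to see with back-of-the-envelope arithmetic. The $3 per million input tokens rate below is a placeholder, not any provider's actual price.

```python
# Input cost scales linearly with prompt length.
PRICE_PER_MTOK = 3.00  # assumed USD per 1M input tokens

def input_cost(n_tokens: int) -> float:
    return n_tokens / 1_000_000 * PRICE_PER_MTOK

for n in (8_000, 200_000, 1_000_000):
    print(f"{n:>9,} tokens -> ${input_cost(n):.2f} per query")
# 8,000 -> $0.02; 200,000 -> $0.60; 1,000,000 -> $3.00
```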