Context window
The context window is the maximum amount of text, measured in tokens, that an LLM can process in a single interaction: in effect, its working memory for that exchange.
The context window determines what an LLM can 'see' at once. GPT-3 (2020) had 2,048 tokens; GPT-4 (2023) launched with 8k, later extended to 32k and 128k variants. By 2024-2026, windows from 200k tokens (Claude Sonnet) to 1M tokens (Gemini 1.5 Pro and, in beta, Claude Sonnet 4) had become standard. Larger windows let a model take in entire codebases or books at once, but input cost per query scales with prompt length.
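To make the token arithmetic concrete, here is a minimal sketch that counts tokens with OpenAI's tiktoken library and checks them against a budget. The 200,000-token budget and the cl100k_base encoding are illustrative assumptions; other models use other tokenizers and limits.

```python
# Minimal sketch: does this text fit in an assumed context window?
# Token counts are tokenizer-specific; cl100k_base is the GPT-4-era encoding.
import tiktoken

CONTEXT_WINDOW = 200_000  # assumed budget, in tokens

def fits_in_context(text: str, budget: int = CONTEXT_WINDOW) -> bool:
    enc = tiktoken.get_encoding("cl100k_base")
    n_tokens = len(enc.encode(text))
    print(f"{n_tokens:,} tokens of a {budget:,}-token window")
    return n_tokens <= budget

fits_in_context("The context window is the model's working memory.")
```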
Example
A 1M-token context (e.g. Gemini 1.5 Pro or Claude Sonnet 4) lets a developer supply tens of thousands of lines of code plus the accompanying docs in one prompt and ask 'analyse where memory leaks might be'. That was impossible with the 8k windows of 2023.
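A hypothetical sketch of how such a prompt might be assembled: walk a repository, concatenate files, and stop before the token budget overflows. The 1M-token budget, the Python-only file filter, and the tokenizer choice are assumptions for illustration; real tooling also needs chunking and filtering.

```python
# Pack a codebase into one long prompt, staying under an assumed token budget.
from pathlib import Path
import tiktoken

BUDGET = 1_000_000  # assumed 1M-token context window
enc = tiktoken.get_encoding("cl100k_base")

def pack_repo(root: str, budget: int = BUDGET) -> str:
    parts, used = [], 0
    for path in sorted(Path(root).rglob("*.py")):  # Python files only, for brevity
        chunk = f"# file: {path}\n{path.read_text(errors='ignore')}\n"
        cost = len(enc.encode(chunk))
        if used + cost > budget:
            break  # stop before overflowing the window
        parts.append(chunk)
        used += cost
    return "".join(parts)

prompt = pack_repo("my_project") + "\nAnalyse where memory leaks might be."
```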
Frequently asked questions
Is bigger always better?
Not necessarily. With very long prompts, models can exhibit 'lost in the middle' behaviour, attending less reliably to information placed in the middle of the context. Input costs also rise roughly linearly with prompt length, so the right window size is a trade-off.
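The linear cost scaling is easy to see with back-of-the-envelope arithmetic. The $3 per million input tokens rate below is a placeholder, not any provider's actual price.

```python
# Input cost scales linearly with prompt length.
PRICE_PER_MTOK = 3.00  # assumed USD per 1M input tokens

def input_cost(n_tokens: int) -> float:
    return n_tokens / 1_000_000 * PRICE_PER_MTOK

for n in (8_000, 200_000, 1_000_000):
    print(f"{n:>9,} tokens -> ${input_cost(n):.2f} per query")
# 8,000 -> $0.02; 200,000 -> $0.60; 1,000,000 -> $3.00
```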