Technique · Updated 2026-04
Context Window
Definition
The context window is the maximum amount of text, measured in tokens, that an LLM can process in a single request. It covers both the input prompt and the generated response: everything the model "sees" at once must fit inside it.
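Since the window is a token budget, applications usually check a prompt against it before sending a request. A minimal sketch, assuming a hypothetical 128K-token model and the common rough heuristic of about 4 characters per token (real tokenizers vary by language and content):

```python
CONTEXT_WINDOW = 128_000  # tokens; example value for a 128K-token model

def estimate_tokens(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_window(prompt: str, reserved_for_output: int = 1_000) -> bool:
    """Check that the prompt leaves room for the model's reply,
    since input and output share the same window."""
    return estimate_tokens(prompt) + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_window("Summarize this article in three bullet points."))
```

In production you would replace the heuristic with the provider's real tokenizer, but the budgeting logic stays the same.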
See also in the glossary
Token
A token is the basic unit of text an LLM processes. It is a fragment of a word, a punctuation mark, or a character that the model uses to understand and generate text.
LLM (Large Language Model)
An LLM is an AI model trained on billions of texts, capable of understanding and generating human language.
RAG (Retrieval-Augmented Generation)
RAG is a technique that connects an LLM to external data sources to generate more accurate and up-to-date answers.
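Because the whole knowledge base rarely fits in the window, RAG selects only the most relevant passages and prepends them to the prompt. A toy sketch of that selection step, using simple word overlap as a stand-in for the vector search a real RAG pipeline would use (the documents and query are illustrative):

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Score each document by word overlap with the query and keep the
    top k -- a crude stand-in for embedding-based similarity search."""
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "The context window is measured in tokens.",
    "Paris is the capital of France.",
    "RAG retrieves documents and adds them to the prompt.",
]
context = retrieve("How does RAG add documents to the prompt?", docs)
prompt = "Answer using this context:\n" + "\n".join(context)
```

Only the retrieved excerpts enter the prompt, so the request stays within the context window regardless of how large the document collection grows.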
Prompt
A prompt is the instruction or question you give an AI to get a response. It's the interface between you and the model.
Frequently Asked Questions
Which LLM has the largest context window?
Among widely used models: Gemini 2.0 supports up to 1M tokens, Claude Opus up to 200K, and GPT-4o up to 128K. These limits change quickly as new models are released, so check the provider's documentation for current figures.
What happens if you exceed it?
Depending on the provider, the request is either rejected outright or the input is truncated, so the model effectively "forgets" the beginning of the conversation. RAG works around this limit by retrieving only the most relevant excerpts instead of sending everything.
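A common client-side defense is a sliding window over the conversation: drop the oldest messages until the history fits the budget. A minimal sketch, reusing the rough ~4 characters-per-token estimate (a real implementation would use the provider's tokenizer):

```python
def truncate_history(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages that fit the token budget,
    dropping the oldest first (sliding-window truncation)."""
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):          # walk newest -> oldest
        cost = max(1, len(msg) // 4)        # rough ~4 chars/token estimate
        if total + cost > max_tokens:
            break                           # budget exhausted: drop the rest
        kept.append(msg)
        total += cost
    return list(reversed(kept))             # restore chronological order
```

This trades memory of early turns for staying under the limit, which is exactly the "forgets the beginning" behavior described above, made explicit and controllable.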