Updated 2026-04
Attention Mechanism
Definition
The attention mechanism allows a model to weigh the importance of each token relative to every other token in the sequence, capturing global context.
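The weighting described above can be sketched as scaled dot-product attention, the formulation used in the Transformer paper. This is a minimal NumPy illustration, not any library's implementation; the function names are my own.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each query attends to every key,
    # so every position is weighted against all others (global context).
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # (n, n) pairwise similarities
    weights = softmax(scores, axis=-1) # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
n, d = 4, 8                      # 4 tokens, 8-dimensional embeddings
X = rng.normal(size=(n, d))
out, w = attention(X, X, X)      # self-attention: Q = K = V = X
```

Each row of `w` is a probability distribution over all tokens, which is exactly the "weighing the importance of each token" in the definition.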
See also in the glossary
Transformer
The Transformer is the neural network architecture behind virtually all modern LLMs, introduced by Google researchers in 2017.
LLM (Large Language Model)
An LLM is an AI model trained on vast amounts of text, capable of understanding and generating human language.
Deep Learning
Deep Learning is a subset of Machine Learning using multi-layered neural networks to learn complex representations from raw data.
Context Window
The context window is the maximum amount of text, measured in tokens, that an LLM can process in a single request.
Frequently Asked Questions
What does 'Attention is All You Need' mean?
It's the title of Google's 2017 paper that introduced the Transformer. It showed that the attention mechanism alone was sufficient, without recurrent networks.
Does attention have a cost?
Yes. Standard attention has quadratic cost in sequence length: doubling the text length quadruples the computation. That's why context windows have limits.
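The quadratic cost comes from the attention score matrix, which holds one entry per (query, key) pair. A small NumPy sketch (my own illustration, with an assumed embedding size of 64) makes the scaling concrete:

```python
import numpy as np

def score_matrix_entries(n, d=64):
    # Build the n x n attention score matrix for a random n-token sequence.
    # Its size grows as n^2, which is where the quadratic cost comes from.
    X = np.random.default_rng(0).normal(size=(n, d))
    scores = X @ X.T / np.sqrt(d)  # shape (n, n)
    return scores.size

for n in (256, 512, 1024):
    print(n, score_matrix_entries(n))
```

Doubling `n` from 256 to 512 quadruples the entry count, matching the statement above.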