Attention Mechanism

Definition

The attention mechanism allows a model to weigh the importance of each word relative to all others, capturing global context.
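
To make the definition concrete, here is a minimal sketch of scaled dot-product attention, the variant used in Transformers, written in Python with NumPy. The function name, array shapes, and toy inputs are illustrative assumptions, not part of any particular library:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention sketch.

    Q, K, V: arrays of shape (seq_len, d) holding one query,
    key, and value vector per token.
    """
    d = Q.shape[-1]
    # Similarity of every query with every key: a (seq_len, seq_len) matrix.
    scores = Q @ K.T / np.sqrt(d)
    # Softmax over keys turns each row into attention weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of all value vectors,
    # so every position can draw on global context.
    return weights @ V

# Toy self-attention example: 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

In self-attention, as above, queries, keys, and values all come from the same sequence, which is how each word ends up weighted against all the others.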

Frequently Asked Questions

What does 'Attention is All You Need' mean?
It is the title of the 2017 Google paper that introduced the Transformer architecture. It showed that attention alone, without recurrent or convolutional layers, was sufficient for sequence modeling.
Does attention have a cost?
Yes. Classic attention has a cost that is quadratic in sequence length: every token attends to every other token, so doubling the text length quadruples the computation and the memory needed for the attention matrix. That is why context windows have limits.
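
The quadratic growth is easy to see by counting the entries of the seq_len × seq_len score matrix that plain attention must compute and store. The sequence lengths below are arbitrary, chosen only to show the trend:

```python
# One attention score per (query, key) pair: n tokens -> n * n entries.
for n in [1_000, 2_000, 4_000]:
    score_entries = n * n
    print(f"{n:>5} tokens -> {score_entries:>12,} score entries")
#  1000 tokens ->    1,000,000 score entries
#  2000 tokens ->    4,000,000 score entries  (2x tokens, 4x work)
#  4000 tokens ->   16,000,000 score entries
```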