Modèle Updated 2026-04
Transformer
Transformer Architecture
Definition
The Transformer is the neural network architecture powering all modern LLMs, invented by Google in 2017.
See also in the glossary
L
LLM (Large Language Model)
An LLM is an AI model trained on billions of texts, capable of understanding and generating human language.
A
Attention Mechanism
The attention mechanism allows a model to weigh the importance of each word relative to all others, capturing global context.
D
Deep Learning
Deep Learning is a subset of Machine Learning using multi-layered neural networks to learn complex representations from raw data.
N
Neural Network
A neural network is a computing model inspired by the human brain, composed of layers of interconnected nodes that process information to learn patterns.
Tools that use transformer
Frequently Asked Questions
Why did the Transformer revolutionize AI?
Thanks to the attention mechanism that processes all words in parallel (not sequentially). This captures long-range relationships in text and enables massive scaling.
Do all LLMs use Transformers?
Yes, in 2026 all major LLMs (GPT, Claude, Gemini, Llama, Mistral) are Transformer-based. Alternatives exist (Mamba, RWKV) but remain niche.