Model · Updated 2026-04
Mixture of Experts (MoE)
Definition
MoE is a model architecture that activates only a fraction of its parameters for each request, making large models more efficient.
See also in the glossary
L
LLM (Large Language Model)
An LLM is an AI model trained on billions of texts, capable of understanding and generating human language.
T
Transformer
The Transformer is the neural network architecture powering nearly all modern LLMs, introduced by Google researchers in the 2017 paper "Attention Is All You Need".
A
AI Inference
Inference is the process of using a trained AI model to generate predictions or responses from new data.
F
Foundation Model
A foundation model is a large AI model pre-trained on massive data, adaptable to multiple tasks.
Frequently Asked Questions
How does MoE work?
The model contains multiple specialized 'experts', typically feed-forward sub-networks. A learned router decides which experts to activate for each token. As a result, a model with 1T total parameters might activate only around 100B of them for any given request, keeping compute costs far below those of a dense model of the same size.
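The routing step described above can be sketched in a few lines. This is a minimal illustration, not any model's actual implementation: the expert count, hidden dimension, and top-k value are arbitrary assumptions, and the weights are random rather than learned.

```python
# Toy sketch of MoE top-k routing. All sizes below are illustrative
# assumptions, not taken from any real model.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 8   # total experts in the layer
TOP_K = 2         # experts activated per token
D_MODEL = 16      # hidden dimension

# Each "expert" is stood in for by a small weight matrix.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
# The router is a linear map from a token vector to per-expert scores.
router_w = rng.normal(size=(D_MODEL, NUM_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector x through its top-k experts only."""
    scores = x @ router_w                 # one logit per expert
    top = np.argsort(scores)[-TOP_K:]     # indices of the best TOP_K experts
    weights = np.exp(scores[top])
    weights /= weights.sum()              # softmax over the chosen experts
    # Only TOP_K of the NUM_EXPERTS experts do any work for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
out = moe_forward(token)
print(out.shape)  # → (16,)
```

The key point is that the per-token cost scales with TOP_K, not with NUM_EXPERTS, which is why total parameter count can grow far faster than inference cost.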
Which models use MoE?
GPT-4 (rumored, never confirmed by OpenAI), Mistral's Mixtral (confirmed), Google's Gemini, and DeepSeek-V3 all use MoE. It has become the dominant architecture for very large models.