Technique (Updated 2026-04)

Model Distillation

Definition

Distillation transfers knowledge from a large model (teacher) to a smaller model (student), preserving performance at lower cost.
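The standard recipe (following Hinton et al.'s knowledge-distillation loss) trains the student on a blend of the teacher's temperature-softened output distribution and the true labels. Below is a minimal, dependency-free sketch of that loss; the function names, example logits, and default hyperparameters (T=2.0, alpha=0.5) are illustrative assumptions, not a reference implementation.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about non-top classes.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label, T=2.0, alpha=0.5):
    """Blend of a soft-target term (match the teacher at temperature T)
    and a hard-label cross-entropy term (match the ground truth)."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL(teacher || student) on the softened distributions, scaled by T^2
    # so gradient magnitudes stay comparable across temperatures.
    soft = (T * T) * sum(pt * math.log(pt / ps)
                         for pt, ps in zip(p_teacher, p_student))
    # Standard cross-entropy against the one-hot ground-truth label.
    hard = -math.log(softmax(student_logits)[true_label])
    return alpha * soft + (1 - alpha) * hard

# Hypothetical logits for one training example with 3 classes:
loss = distillation_loss([1.0, 2.0, 0.5], [1.2, 2.5, 0.3], true_label=1)
```

In a real training loop this loss is computed per batch and backpropagated through the student only; the teacher's weights stay frozen.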

Frequently Asked Questions

Why distill instead of fine-tune?
Fine-tuning adapts an existing model to new data or tasks; distillation trains a new, smaller model to mimic a larger one's outputs. The distilled student is faster and cheaper to run at inference time.
Does DeepSeek use distillation?
Yes. DeepSeek distilled its larger reasoning models into compact students (for example, the DeepSeek-R1 distillations onto Qwen- and Llama-based models), which contributed to their strong cost-to-performance ratio.