Updated 2026-04
Model Collapse
Definition
Model collapse is a phenomenon where an AI model trained on data generated by other AI models progressively loses quality and diversity, converging toward degenerate outputs.
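The dynamic is easy to reproduce in miniature. The sketch below is a toy illustration (not how real LLM training works, and every name in it is mine): it repeatedly fits a Gaussian to samples drawn from the previous fit, so each "model" trains only on the last model's output. Finite-sample estimation error compounds across generations, and the distribution's diversity, measured by its standard deviation, collapses toward zero.

```python
import random
import statistics

def fit(samples):
    # "Training" here just means estimating a Gaussian's parameters.
    return statistics.mean(samples), statistics.stdev(samples)

def sample(mu, sigma, n, rng):
    # "Generation": draw synthetic data from the fitted model.
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(0)           # fixed seed for reproducibility
n = 10                           # deliberately small: finite-sample error drives the collapse
data = sample(0.0, 1.0, n, rng)  # generation 0: "human" data from N(0, 1)

sigmas = []
for generation in range(1000):
    mu, sigma = fit(data)              # train on the previous generation's output
    sigmas.append(sigma)
    data = sample(mu, sigma, n, rng)   # the next model sees only synthetic data

print(f"std at generation 0:   {sigmas[0]:.4f}")
print(f"std at generation 999: {sigmas[-1]:.4g}")
```

Each refit loses a little of the original distribution's tails, so the chain drifts toward a near-degenerate point mass. That shrinking spread is the statistical core of model collapse: quality and diversity erode even though every individual training step looks reasonable.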
See also in the glossary
LLM (Large Language Model)
An LLM is an AI model trained on billions of texts, capable of understanding and generating human language.
Generative AI
Generative AI refers to artificial intelligence systems capable of creating original content: text, images, video, audio, code.
Overfitting
Overfitting occurs when an AI model learns the training data too closely, memorizing its noise and quirks, and as a result fails to generalize to new data.
Fine-tuning
Fine-tuning is the process of retraining an existing AI model on a specific dataset to adapt it to a particular domain or task.
Deep Learning
Deep Learning is a subset of Machine Learning using multi-layered neural networks to learn complex representations from raw data.
AI Benchmark
An AI benchmark is a standardized test that measures and compares AI model performance on specific tasks.
Frequently Asked Questions
Does model collapse threaten future LLMs?
Yes, it is a real risk. As the web fills with AI-generated text, future models trained on that data may underperform their predecessors. This is why pre-AI data (created before 2022) has become a strategic asset, and why media publishers are negotiating licensing deals with AI labs.
How do AI labs prevent model collapse?
Key strategies include filtering synthetic data out of training datasets, prioritizing verified human data sources, using classifiers to detect AI-generated content, and maintaining pre-AI data archives as a reference.
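These strategies can be combined into a simple filtering pipeline. The sketch below is a minimal illustration of that idea, not any lab's actual pipeline: the source names, the 2022 cutoff as a "pre-AI" heuristic, the detector score, and the threshold are all hypothetical placeholders.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    source: str      # provenance label, e.g. a crawl or archive name (illustrative)
    year: int        # year the document was created
    ai_score: float  # hypothetical AI-content detector output in [0, 1]

# Illustrative allowlist of verified human archives.
TRUSTED_SOURCES = {"wikipedia-dump-2021", "books-archive"}
AI_SCORE_MAX = 0.5  # illustrative detector threshold

def keep_for_training(doc: Doc) -> bool:
    if doc.source in TRUSTED_SOURCES:
        return True   # verified human data source: always keep
    if doc.year < 2022:
        return True   # pre-AI-era archive treated as human-written
    return doc.ai_score < AI_SCORE_MAX  # otherwise, rely on the detector

corpus = [
    Doc("...", "wikipedia-dump-2021", 2021, 0.9),  # kept: trusted archive
    Doc("...", "web-crawl-2025", 2025, 0.8),       # dropped: likely AI-generated
    Doc("...", "web-crawl-2025", 2025, 0.1),       # kept: detector says human
]
train_set = [d for d in corpus if keep_for_training(d)]
print(f"kept {len(train_set)} of {len(corpus)} documents")
```

The ordering of the checks reflects the FAQ's logic: provenance and age are cheap, reliable signals, so they are tested first, and the noisier detector score is only consulted for recent data of unknown origin.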