Technique · Updated 2026-04
Test-time Compute
Definition
Test-time compute is the practice of allocating more computation at inference time to improve response quality, rather than scaling up the model itself.
See also in the glossary
AI Inference
Inference is the process of using a trained AI model to generate predictions or responses from new data.
AI Reasoning
AI reasoning refers to a model's ability to break down a problem into logical steps to reach a conclusion, rather than answering instinctively.
Chain of Thought
Chain of Thought is a prompting technique that asks the model to show its step-by-step reasoning before giving its final answer.
LLM (Large Language Model)
An LLM is an AI model trained on vast amounts of text, capable of understanding and generating human language.
Frequently Asked Questions
Why is test-time compute important?
Instead of making the model bigger (expensive to train), you make it think longer (expensive only when needed). That's the principle behind OpenAI o1 and DeepSeek R1.
How does it work?
The model generates multiple reasoning paths, evaluates them, and picks the best one. More compute time at inference generally means better answers.
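The sample-evaluate-pick loop above can be sketched in a few lines. This is a minimal, self-contained illustration of one common form of test-time compute (majority voting over repeated samples, as in self-consistency); the `sample_answer` function is a toy stand-in for what would be a real LLM call with temperature > 0, not an actual API.

```python
import random
from collections import Counter

# Hypothetical stand-in for an LLM call: a noisy solver that is right
# most of the time. In a real system this would sample the model.
def sample_answer(question: str, rng: random.Random) -> int:
    correct = 17 * 24  # pretend the model is solving this arithmetic question
    # 70% chance of the correct answer, otherwise a nearby wrong guess.
    if rng.random() < 0.7:
        return correct
    return correct + rng.choice([-10, -1, 1, 10])

def self_consistency(question: str, n: int, seed: int = 0) -> int:
    """Spend more test-time compute: sample n answers, return the majority."""
    rng = random.Random(seed)
    answers = [sample_answer(question, rng) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 17 * 24?", n=25))
```

With more samples (a larger `n`), the majority vote becomes more reliable than any single sample, which is the core trade: extra inference-time compute buys accuracy without retraining or enlarging the model.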