Technique · Updated 2026-04

AI Inference

Definition

Inference is the process of using a trained AI model to generate predictions or responses from new data.

Frequently Asked Questions

What's the difference between training and inference?
Training creates the model (expensive, done up front). Inference runs the trained model to answer each request (far cheaper per request). When you ask ChatGPT a question, that's inference.
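The split can be sketched with a toy model: a one-time "training" step that fits a single weight, and a cheap "inference" function that reuses it for each new input. All numbers and function names here are invented for illustration.

```python
# Toy illustration: "training" fits parameters once; "inference" reuses them per request.

def train(examples):
    """One-time, expensive step: fit a slope w for y = w * x by least squares."""
    num = sum(x * y for x, y in examples)
    den = sum(x * x for x, _ in examples)
    return num / den  # the "model" is just one learned weight

def infer(w, x):
    """Cheap, per-request step: apply the frozen model to new input."""
    return w * x

model = train([(1, 2), (2, 4), (3, 6)])  # learns w = 2.0
print(infer(model, 10))                  # prediction for unseen input: 20.0
```

Real models have billions of weights instead of one, but the shape is the same: training is paid once, inference is paid on every request.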
Why does inference cost money?
Each request runs through the model on GPUs, so cost scales with the size of the model and with the number of tokens processed and generated. That's why APIs charge per token.
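A minimal sketch of per-token billing, using invented rates (these are not any real API's prices; input tokens are typically billed cheaper than output tokens):

```python
# Hypothetical pricing for illustration only.
PRICE_PER_INPUT_TOKEN = 0.000003   # e.g. $3 per million input tokens (assumed)
PRICE_PER_OUTPUT_TOKEN = 0.000015  # e.g. $15 per million output tokens (assumed)

def request_cost(input_tokens, output_tokens):
    """Cost grows with how much the model must read and generate."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

# Same prompt, different response lengths: the longer answer costs more.
print(request_cost(500, 100))   # short answer
print(request_cost(500, 2000))  # long answer
```

Output tokens dominate the bill for long responses, which is why the length of the response matters as much as the length of the prompt.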