Technique · Updated 2026-04
AI Inference
Definition
Inference is the process of using a trained AI model to generate predictions or responses from new data.
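A minimal sketch of the idea: inference is just applying parameters that were already learned during training to new input. The weights below are made-up illustration values, not from any real model.

```python
# "Inference" = apply already-learned parameters to new data.
# These weights are hypothetical, standing in for a trained model.
weights = [0.8, -0.3, 0.5]

def infer(features):
    """Run inference: combine learned weights with new input features."""
    return sum(w * x for w, x in zip(weights, features))

print(infer([1.0, 2.0, 3.0]))  # prediction for an unseen input
```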
See also in the glossary
L
LLM (Large Language Model)
An LLM is an AI model trained on vast amounts of text, capable of understanding and generating human language.
T
Token
A token is the basic unit processed by an LLM: a fragment of a word, a punctuation mark, or a character that the model uses to understand and generate text.
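As a rough illustration, the toy tokenizer below splits text into words and punctuation. Real LLM tokenizers use subword schemes such as BPE, so this is only a crude stand-in to show that one sentence becomes several billable units.

```python
import re

def toy_tokenize(text):
    """Crude stand-in for an LLM tokenizer: words and punctuation marks.
    Real tokenizers split into learned subword pieces instead."""
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("Inference isn't free!")
print(tokens)      # ['Inference', 'isn', "'", 't', 'free', '!']
print(len(tokens)) # 6 "tokens" for this sentence
```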
A
AI API
An AI API allows developers to integrate artificial intelligence capabilities into their applications.
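For a concrete picture, here is the general shape of a chat-completion request body that most AI APIs accept. The model name and field values are assumptions for illustration, not any specific vendor's API.

```python
import json

# Hypothetical request payload in the common chat-completion shape;
# "example-llm" and the parameter values are illustrative only.
payload = {
    "model": "example-llm",
    "messages": [
        {"role": "user", "content": "Summarize this paragraph."}
    ],
    "max_tokens": 100,
}

body = json.dumps(payload)  # serialized body an app would POST to the API
print(body)
```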
G
GPU Cloud
GPU Cloud provides on-demand graphics processors for training and running AI models without hardware investment.
Frequently asked questions
What's the difference between training and inference?
Training creates the model (expensive, done once). Inference uses the model to respond (cheaper, per request). When you ask ChatGPT a question, that's inference.
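The train-once, infer-many asymmetry can be sketched with a toy model: one least-squares fit (the "training"), then a frozen parameter reused for every prediction (the "inference"). The data is made up for the example.

```python
# Toy data: y = 2x, so training should recover a slope of 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# "Training": one-time least-squares fit of a slope through the origin.
slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# "Inference": cheap, repeated per request with the frozen parameter.
def predict(x):
    return slope * x

print(slope)        # ~2.0, learned once
print(predict(10))  # ~20.0, computed on every request
```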
Why does inference cost money?
Each request requires GPU computation: the longer the response and the larger the model, the more the request costs. That's why APIs charge per token.
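A back-of-envelope estimate of per-token billing. The prices below are made-up illustration values, not any provider's actual rates; output tokens are priced higher here because generating them is the expensive part.

```python
# Assumed illustration prices, in dollars per token (not real pricing).
PRICE_PER_INPUT_TOKEN = 0.000002
PRICE_PER_OUTPUT_TOKEN = 0.000006

def request_cost(input_tokens, output_tokens):
    """Estimated cost of one API request under the assumed prices."""
    return (input_tokens * PRICE_PER_INPUT_TOKEN
            + output_tokens * PRICE_PER_OUTPUT_TOKEN)

print(request_cost(500, 1000))  # longer responses -> higher cost
```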