Γthique Updated 2026-04
AI Safety
Definition
AI Safety is the field focused on ensuring AI systems are safe, reliable and don't cause unintended harm.
See also in the glossary
A
AI Alignment
AI alignment aims to ensure an artificial intelligence system acts in accordance with human values and intentions.
A
AI Hallucination
An AI hallucination is a response generated by an AI model that appears plausible but is factually incorrect or fabricated.
R
RLHF (Reinforcement Learning from Human Feedback)
RLHF is a training technique that uses human feedback to align an LLM's behavior with user expectations.
G
Generative AI
Generative AI refers to artificial intelligence systems capable of creating original content: text, images, video, audio, code.
Tools that use ai safety
Frequently Asked Questions
Why is AI Safety important?
LLMs can generate harmful content, be manipulated through prompt injection, or make biased decisions. AI Safety seeks to prevent these risks.
Who works on AI Safety?
Anthropic (Claude's creator) was explicitly founded for AI Safety. OpenAI, Google DeepMind and Meta also have dedicated teams.