Hallucination

LLMs don't retrieve facts from a database; they generate text based on learned statistical patterns. Sometimes those patterns produce false statements with the same confident fluency as true ones. The model doesn't "know" it's wrong: it has no built-in truth signal, only next-token probabilities.

Types of hallucination

Factual hallucination: Inventing specific facts (dates, statistics, citations) that don't exist.
Citation fabrication: Generating plausible-sounding paper titles, DOIs, or URLs that don't exist.
Instruction hallucination: Claiming to have performed an action (reading a file, executing code) that wasn't actually done.
Temporal confusion: Stating present-tense facts based on stale training data.

Why models hallucinate

Training optimizes next-token prediction across a diverse corpus. When the model is asked about something rare or absent from training data, it tends to generate statistically plausible text rather than saying "I don't know." The model has no reliable built-in mechanism to distinguish "I'm generating based on solid evidence" from "I'm filling in based on pattern matching."
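
A toy sketch of this point, assuming a made-up four-token vocabulary and hypothetical logits (none of these numbers come from a real model): softmax turns any logit vector into a valid probability distribution, so greedy decoding always emits some token, whether the distribution is sharply peaked or nearly flat. Entropy is shown only as a rough measure of how spread out the distribution is; it is not a truth signal.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy in nats; higher means a flatter distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical candidate next tokens for a "what year did X happen?" prompt.
VOCAB = ["1889", "1901", "1923", "1947"]

# Hypothetical logits: one case where the pattern is well learned,
# one where the model has almost no preference among the candidates.
peaked_logits = [8.0, 1.0, 0.5, 0.2]
flat_logits = [1.1, 1.0, 0.9, 1.0]

def greedy_token(logits):
    """Return (token, probability, entropy) for the highest-scoring token."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return VOCAB[best], probs[best], entropy(probs)

if __name__ == "__main__":
    for name, logits in [("peaked", peaked_logits), ("flat", flat_logits)]:
        token, prob, ent = greedy_token(logits)
        print(f"{name}: token={token} p={prob:.3f} entropy={ent:.3f}")
```

Both cases produce a year in fluent output; only the underlying probabilities differ, and those are normally not visible in the generated text.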

Mitigating hallucinations

Retrieval-Augmented Generation (RAG): Ground responses in retrieved documents the model can reference, reducing fabrication of unseen facts.
Tool use: Let the model call a search API or database rather than generate facts from memory.
Constrained output: Request JSON with required fields; structural constraints narrow the room for free-form fabrication, although the field values themselves can still be wrong.
Self-consistency: Generate multiple completions and select the most common answer.
Calibrated uncertainty: Prompt the model to express confidence levels and refuse low-confidence claims.
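
The self-consistency and uncertainty ideas above can be combined in a minimal sketch: sample several completions, take a majority vote, and abstain when agreement is low. Everything here is an assumption for illustration: `sample_fn` is a hypothetical stand-in for a sampled LLM call, the `min_agreement` threshold of 0.6 is arbitrary, and answers are compared by simple normalized string equality, which only suits short answers.

```python
from collections import Counter

def self_consistent_answer(sample_fn, prompt, n_samples=5, min_agreement=0.6):
    """Majority vote over several sampled completions.

    sample_fn is a hypothetical callable(prompt) -> str standing in for an
    LLM call with sampling enabled. Returns (answer, agreement), where
    answer is None if the most common answer falls below min_agreement.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Normalize so trivial differences in case or whitespace do not split votes.
    answers = [sample_fn(prompt).strip().lower() for _ in range(n_samples)]
    top_answer, count = Counter(answers).most_common(1)[0]
    agreement = count / n_samples
    if agreement < min_agreement:
        return None, agreement  # abstain instead of guessing
    return top_answer, agreement

def make_stub_sampler(responses):
    """Test helper: returns canned responses in order instead of calling a model."""
    it = iter(responses)
    return lambda prompt: next(it)

if __name__ == "__main__":
    sampler = make_stub_sampler(["Paris", "paris ", "Lyon", "Paris", "Paris"])
    print(self_consistent_answer(sampler, "What is the capital of France?"))
```

High agreement is not proof of correctness, since a model can be consistently wrong, so a vote like this is better treated as a filter for unstable answers than as verification.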
