Hallucinations Live in Variance
Aaron R. Flouro, Shawn P. Chadwick

TL;DR
This paper frames hallucinations as variance-driven unreliability in AI models and proposes Semantic Stability and Paraphrase Consistency as diagnostics for measuring it.
Contribution
It formalizes hallucinations as variance in model outputs, introduces Semantic Stability and Paraphrase Consistency metrics, and analyzes how model sparsity affects reliability and stability.
Findings
Dense Qwen3-0.6B agrees with itself only 23.8% of the time
At 32% sparsity, agreement increases to 55.9%
A phase diagram identifies regimes where variance reduction improves stability
Abstract
Benchmarks measure whether a model is correct. They do not measure whether a model is reliable. This distinction is largely academic for single-shot inference, but becomes critical for agentic AI systems, where a single rephrased prompt can trigger cascading failures in multi-step execution. Yet this form of instability is not captured by existing evaluations. Hallucinations live in variance: they arise when semantically equivalent prompts activate inconsistent internal pathways, producing divergent outputs. Consistent but incorrect outputs reflect bias or missing knowledge; confident guessing reflects calibration failure. Neither constitutes hallucination under this definition. When error is variance-dominated, reducing redundant pathways improves reliability without adding knowledge. We formalize this through Semantic Stability (SS), measured via Paraphrase Consistency (PC@k):…
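The abstract's definition of PC@k is cut off above, so the following is only a minimal sketch of the general idea of self-agreement across paraphrases, not the paper's metric. It assumes answers are compared by exact match after light normalization; the function names (`normalize`, `modal_agreement`, `all_agree_rate`) and the matching rule are hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Sequence


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (an assumed, simplistic matching rule)."""
    return " ".join(answer.lower().split())


def modal_agreement(answers: Sequence[str]) -> float:
    """Fraction of answers that match the most common answer.

    `answers` holds one model output per paraphrase of the same question.
    """
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers)
    return counts.most_common(1)[0][1] / len(answers)


def all_agree_rate(answer_sets: Sequence[Sequence[str]]) -> float:
    """Fraction of questions whose paraphrase answers are all identical."""
    if not answer_sets:
        raise ValueError("need at least one question")
    agree = sum(len({normalize(a) for a in s}) == 1 for s in answer_sets)
    return agree / len(answer_sets)


if __name__ == "__main__":
    # Hypothetical outputs for four paraphrases of a single question.
    outputs = ["Paris", "paris ", "Lyon", "Paris"]
    print(modal_agreement(outputs))
    print(all_agree_rate([outputs, ["42", "42"]]))
```

Exact-match comparison is a deliberately crude stand-in: a score of this kind tracks consistency only, so a model that is consistently wrong would still score high, which matches the abstract's separation of variance from bias.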
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management
