Hallucinations Live in Variance

Aaron R. Flouro; Shawn P. Chadwick

arXiv:2601.07058·cs.LG·January 13, 2026

Open Access

TL;DR

This paper reframes hallucination as variance-driven unreliability: semantically equivalent prompts producing divergent outputs. It proposes Semantic Stability (SS), measured via Paraphrase Consistency (PC@k), as a diagnostic for this instability.

Contribution

It formalizes hallucinations as variance in model outputs across semantically equivalent prompts, introduces the Semantic Stability and Paraphrase Consistency metrics, and analyzes how model sparsity affects stability.

Findings

01

Dense Qwen3-0.6B agrees with itself only 23.8% of the time

02

At 32% sparsity, agreement increases to 55.9%

03

A phase diagram identifies regimes where variance reduction improves stability

Abstract

Benchmarks measure whether a model is correct. They do not measure whether a model is reliable. This distinction is largely academic for single-shot inference, but becomes critical for agentic AI systems, where a single rephrased prompt can trigger cascading failures in multi-step execution. Yet this form of instability is not captured by existing evaluations. Hallucinations live in variance: they arise when semantically equivalent prompts activate inconsistent internal pathways, producing divergent outputs. Consistent but incorrect outputs reflect bias or missing knowledge; confident guessing reflects calibration failure. Neither constitutes hallucination under this definition. When error is variance-dominated, reducing redundant pathways improves reliability without adding knowledge. We formalize this through Semantic Stability (SS), measured via Paraphrase Consistency (PC@k):…
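The abstract's definition of PC@k is truncated in this listing, so the following is not the paper's metric. It is only a minimal sketch of the general idea of agreement across paraphrases, under stated assumptions: answers are compared by normalized exact match (a crude stand-in for semantic equivalence), and the names `normalize`, `modal_agreement`, `all_agree`, and `consistency_rate` are hypothetical helpers introduced for illustration.

```python
from collections import Counter
from typing import Callable, Sequence


def normalize(answer: str) -> str:
    """Assumption: case- and whitespace-insensitive match stands in for semantic equivalence."""
    return " ".join(answer.lower().split())


def modal_agreement(answers: Sequence[str]) -> float:
    """Fraction of answers that match the most common normalized answer."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(normalize(a) for a in answers)
    return counts.most_common(1)[0][1] / len(answers)


def all_agree(answers: Sequence[str]) -> bool:
    """True when every answer normalizes to the same string."""
    return len({normalize(a) for a in answers}) == 1


def consistency_rate(
    model: Callable[[str], str],
    paraphrase_sets: Sequence[Sequence[str]],
) -> float:
    """Share of paraphrase groups for which the model gives one consistent answer."""
    if not paraphrase_sets:
        raise ValueError("need at least one paraphrase group")
    stable = sum(all_agree([model(p) for p in group]) for group in paraphrase_sets)
    return stable / len(paraphrase_sets)


# Toy stand-in for a model: a lookup table, purely illustrative.
toy_model = {
    "Capital of France?": "Paris",
    "What is France's capital?": "paris",
    "2+2?": "4",
    "What is two plus two?": "five",
}.get

groups = [
    ["Capital of France?", "What is France's capital?"],
    ["2+2?", "What is two plus two?"],
]
rate = consistency_rate(toy_model, groups)  # one of the two groups is consistent
```

Note that a score like this measures only agreement, not correctness: a model that gives the same wrong answer to every paraphrase would still count as consistent, which matches the abstract's distinction between variance and bias.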

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management