Mind the Unseen Mass: Unmasking LLM Hallucinations via Soft-Hybrid Alphabet Estimation
Hongxing Pan, Yingying Guo, Wenqing Kuang, Jiashi Lu

TL;DR
This paper introduces SHADE, an estimator of semantic uncertainty in large language models that is designed to remain accurate when only a few responses can be sampled per query.
Contribution
SHADE combines Generalized Good-Turing coverage with a heat-kernel trace of the normalized Laplacian of an entailment-weighted response graph to estimate semantic alphabet size from small samples.
Findings
SHADE outperforms existing methods in low-sample regimes.
It provides more accurate semantic occupancy estimates with fewer samples.
The approach improves QA incorrectness detection under limited sampling.
Abstract
This paper studies uncertainty quantification for large language models (LLMs) under black-box access, where only a small number of responses can be sampled for each query. In this setting, estimating the effective semantic alphabet size, that is, the number of distinct meanings expressed in the sampled responses, provides a useful proxy for downstream risk. However, frequency-based estimators tend to undercount rare semantic modes when the sample size is small, while graph-spectral quantities alone are not designed to estimate semantic occupancy accurately. To address this issue, we propose SHADE (Soft-Hybrid Alphabet Dynamic Estimator), a simple and interpretable estimator that combines Generalized Good-Turing coverage with a heat-kernel trace of the normalized Laplacian constructed from an entailment-weighted graph over sampled responses. The estimated coverage adaptively determines…
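The two ingredients named in the abstract can be illustrated separately. The sketch below is not the authors' implementation: it uses the classical Good-Turing coverage estimate (one minus the fraction of samples that fall in singleton clusters) as a stand-in for the generalized variant, and it computes the heat-kernel trace of a normalized Laplacian for an assumed symmetric weight matrix `W` with self-similarity on the diagonal. The function names, the diffusion time `t`, and the input conventions are all hypothetical, and the way SHADE combines the two quantities is not reproduced here.

```python
import numpy as np
from collections import Counter


def good_turing_coverage(labels):
    """Classical Good-Turing coverage: 1 - n1 / n.

    `labels` holds one semantic-cluster id per sampled response (assumed
    to come from some upstream clustering step). n1 is the number of
    clusters observed exactly once.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("need at least one sampled response")
    counts = Counter(labels)
    n1 = sum(1 for c in counts.values() if c == 1)
    return 1.0 - n1 / n


def heat_kernel_trace(W, t=1.0):
    """Trace of exp(-t * L) for the normalized Laplacian L of W.

    `W` is assumed to be a non-negative similarity matrix (for example,
    entailment scores) that includes self-similarity on the diagonal.
    """
    W = np.asarray(W, dtype=float)
    W = 0.5 * (W + W.T)  # symmetrize, since entailment scores may be directed
    deg = W.sum(axis=1)
    safe_deg = np.where(deg > 0, deg, 1.0)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(safe_deg), 0.0)
    L = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)  # L is symmetric, so eigvalsh applies
    return float(np.exp(-t * eigvals).sum())
```

Under these assumptions, each connected component of the graph contributes a zero eigenvalue, so for large `t` the trace approaches the number of components, while at `t = 0` it equals the number of responses. That behavior is what makes the trace a soft count of semantic groups.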
