TL;DR
This paper introduces a multilingual hallucination span detection method based on LLM uncertainty. It applies entropy analysis to stochastically sampled responses to identify hallucinated content without additional training.
Contribution
The proposed approach is cost-effective and training-free: it uses entropy-based analysis of response variability to detect hallucinated spans in multilingual LLM outputs.
Findings
Effective hallucination span detection across multiple languages.
Entropy-based divergence correlates with hallucinated content.
Method is adaptable and requires no additional training.
Abstract
Identifying hallucination spans in text generated by black-box language models is essential for real-world applications. A recent effort in this direction is SemEval-2025 Task 3, Mu-SHROOM, a Multilingual Shared Task on Hallucinations and Related Observable Over-generation Errors. In this work, we present our solution to this problem, which exploits the variability of stochastically sampled responses to identify hallucinated spans. Our hypothesis is that if a language model is certain of a fact, its sampled responses will be consistent, while hallucinated facts will yield divergent and conflicting responses. We measure this divergence through entropy-based analysis, allowing for accurate identification of hallucinated segments. Our method does not depend on additional training and is therefore cost-effective and adaptable. In addition, we conduct extensive hyperparameter…
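The hypothesis in the abstract (consistent samples when the model is certain, conflicting samples when it hallucinates) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the exact-match normalization, and the threshold value are all assumptions made for illustration. The sketch computes the Shannon entropy of the empirical distribution of answers sampled for a single candidate span.

```python
import math
from collections import Counter


def response_entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of sampled answers.

    `samples` holds the answers the model gave for the same span across
    several stochastic generations. Low entropy means the samples agree.
    """
    if not samples:
        raise ValueError("need at least one sampled response")
    # Assumed normalization: case-fold and collapse whitespace so that
    # trivially different strings count as the same answer.
    normalized = [" ".join(s.lower().split()) for s in samples]
    counts = Counter(normalized)
    total = len(normalized)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_suspect(samples, threshold=1.0):
    """Flag a span as possibly hallucinated when sample entropy exceeds a threshold.

    The threshold is a hypothetical hyperparameter, not a value from the paper.
    """
    return response_entropy(samples) > threshold


# Samples that agree give zero entropy; conflicting samples give higher entropy.
consistent = ["Paris", "paris", "Paris "]
conflicting = ["1912", "1915", "1912", "1920"]
print(response_entropy(consistent), is_suspect(consistent))
print(response_entropy(conflicting), is_suspect(conflicting))
```

Exact string matching is the simplest way to group answers. A multilingual system would likely need a softer notion of agreement between samples, which this sketch does not attempt.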
