TL;DR
This paper analyzes the role of semantic consistency in MIL-based hallucination detection for LLMs and proposes a more efficient model that maintains competitive performance without costly semantic similarity computations.
Contribution
It offers a theoretical analysis of decision margins in MIL methods and introduces a lightweight, margin-based approach using max pooling for hallucination detection.
Findings
Theoretical analysis shows that scaling internal states with semantic consistency enlarges the decision margin.
The proposed model achieves efficiency gains over state-of-the-art methods such as HaMI.
It maintains competitive performance without semantic similarity computations.
Abstract
Hallucination detection has become increasingly important for improving the reliability of large language models (LLMs). Recently, hybrid approaches such as HaMI, which combine semantic consistency with internal model states via Multiple Instance Learning (MIL), have achieved state-of-the-art performance. However, these methods incur substantial computational overhead due to repeated sampling and costly semantic similarity computations. In this work, we first provide a theoretical analysis of HaMI in terms of decision margins, revealing that scaling internal states with semantic consistency leads to an enlarged decision margin. Motivated by this insight, we revisit classical sentence classification models from a margin enlargement perspective, aggregating token-level features via max pooling and directly estimating sentence scores using a lightweight MLP. Without requiring semantic…
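The pooling-then-scoring step described in the abstract can be illustrated with a minimal sketch. It assumes token-level features arrive as plain lists of floats and that the "lightweight MLP" is a single ReLU hidden layer with a sigmoid output; the function names (`max_pool`, `mlp_score`, `init_params`, `sentence_score`), the layer sizes, and the random untrained weights are all hypothetical and are not taken from the paper.

```python
import math
import random

def max_pool(token_features):
    """Element-wise max over tokens: T vectors of size d -> one vector of size d."""
    if not token_features:
        raise ValueError("need at least one token vector")
    return [max(column) for column in zip(*token_features)]

def mlp_score(x, w1, b1, w2, b2):
    """Assumed architecture: one ReLU hidden layer, then a sigmoid output in (0, 1)."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

def init_params(dim, hidden_dim, seed=0):
    """Random, untrained weights for illustration only (hypothetical sizes)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
    b2 = 0.0
    return w1, b1, w2, b2

def sentence_score(token_features, params):
    """Pool token features into one sentence vector, then score it with the MLP."""
    return mlp_score(max_pool(token_features), *params)

# Toy input: 3 tokens, each with a 3-dimensional feature vector.
tokens = [[0.1, -0.4, 0.3], [0.7, 0.2, -0.1], [-0.2, 0.5, 0.0]]
params = init_params(dim=3, hidden_dim=4)
pooled = max_pool(tokens)
score = sentence_score(tokens, params)
```

Because max pooling keeps only the largest activation per feature dimension, the sentence score is driven by the most salient tokens, and a single forward pass replaces repeated sampling. In practice the features would presumably come from the LLM's internal states and the MLP would be trained on labeled responses; both are outside this sketch.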
