HalLoc: Token-level Localization of Hallucinations for Vision Language Models
Eunkyu Park, Minyeong Kim, Gunhee Kim

TL;DR
HalLoc introduces a large dataset and a low-overhead detection model for token-level hallucination detection in vision-language models, improving reliability and efficiency in real-world applications.
Contribution
The paper presents a new dataset with token-level hallucination annotations and a baseline detection model that can be integrated into existing models for real-time hallucination detection.
Findings
Dataset of 150K token-level annotated samples, including hallucination types, across VQA, instruction-following, and image captioning tasks.
Baseline model achieves efficient hallucination detection during generation.
Improves trustworthiness of vision-language models in practical scenarios.
Abstract
Hallucinations pose a significant challenge to the reliability of large vision-language models, making their detection essential for ensuring accuracy in critical applications. Current detection methods often rely on computationally intensive models, leading to high latency and resource demands. Their definitive outcomes also fail to account for real-world scenarios where the line between hallucinated and truthful information is unclear. To address these issues, we propose HalLoc, a dataset designed for efficient, probabilistic hallucination detection. It features 150K token-level annotated samples, including hallucination types, across Visual Question Answering (VQA), instruction-following, and image captioning tasks. This dataset facilitates the development of models that detect hallucinations with graded confidence, enabling more informed user interactions. Additionally, we introduce…
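To make the idea of token-level detection with graded confidence concrete, here is a minimal sketch. It is not the HalLoc model or its data format: the `TokenScore` record, the `grade` and `flag_spans` helpers, the thresholds, and the example probabilities are all hypothetical and chosen only for illustration. It assumes some detector has already produced a per-token probability of hallucination.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TokenScore:
    token: str
    p_hallucinated: float  # hypothetical per-token probability from some detector


def grade(p: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a probability to a coarse confidence band (thresholds are arbitrary)."""
    if p >= high:
        return "likely"
    if p >= low:
        return "uncertain"
    return "unlikely"


def flag_spans(scores: List[TokenScore], low: float = 0.3) -> List[Tuple[int, int]]:
    """Return [start, end) index ranges of consecutive tokens with p >= low."""
    spans: List[Tuple[int, int]] = []
    start = None
    for i, s in enumerate(scores):
        if s.p_hallucinated >= low:
            if start is None:
                start = i  # open a new flagged span
        elif start is not None:
            spans.append((start, i))  # close the span at the first clean token
            start = None
    if start is not None:
        spans.append((start, len(scores)))  # span runs to the end of the text
    return spans


# Made-up scores for an illustrative caption; not taken from the dataset.
caption = [
    TokenScore("A", 0.02),
    TokenScore("dog", 0.05),
    TokenScore("holding", 0.45),
    TokenScore("a", 0.40),
    TokenScore("red", 0.85),
    TokenScore("frisbee", 0.10),
]

grades = [grade(s.p_hallucinated) for s in caption]
spans = flag_spans(caption)
```

Keeping the probabilities and mapping them to bands, rather than emitting a single yes/no label, is one way a user interface could show "uncertain" tokens differently from "likely" hallucinated ones.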
Taxonomy
Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices
