Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models
Chaoya Jiang, Hongrui Jia, Wei Ye, Mengfan Dong, Haiyang Xu, Ming Yan, Ji Zhang, Shikun Zhang

TL;DR
Hal-Eval introduces a comprehensive framework for evaluating hallucinations in large vision-language models, including a new event hallucination category, using advanced LLMs to generate detailed assessment data.
Contribution
The paper presents a new taxonomy of hallucinations, especially event hallucinations, and develops a universal evaluation framework combining discriminative and generative methods for LVLMs.
Findings
New taxonomy including event hallucinations
Generation of fine-grained hallucination data using LLMs
A comprehensive benchmark for LVLM hallucination assessment
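As a rough illustration of the discriminative side of such a framework, the sketch below tags each test item with the hallucination type it probes and scores a model's yes/no judgments per type. The four category names come from the summary above. The `Item` record, its field names, and the per-type accuracy metric are assumptions made for illustration, not the paper's actual data format or scoring protocol.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class HallucinationType(Enum):
    # Categories named in the summary; "event" is the newly introduced one.
    OBJECT = "object"
    ATTRIBUTE = "attribute"
    RELATION = "relation"
    EVENT = "event"


@dataclass
class Item:
    # Hypothetical record layout, assumed for this sketch only.
    caption: str                    # caption shown to the model with the image
    probed_type: HallucinationType  # hallucination type this item tests
    is_hallucinated: bool           # ground truth: does the caption hallucinate?
    model_says_hallucinated: bool   # the model's yes/no judgment


def accuracy_by_type(items):
    """Return the fraction of correct judgments for each probed type."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        total[item.probed_type] += 1
        if item.model_says_hallucinated == item.is_hallucinated:
            correct[item.probed_type] += 1
    return {t: correct[t] / total[t] for t in total}


# Toy data, invented purely to exercise the function.
items = [
    Item("A cat sleeps on a sofa.", HallucinationType.OBJECT, False, False),
    Item("A cat sleeps next to a laptop.", HallucinationType.OBJECT, True, False),
    Item("A cat chases a mouse across the room.", HallucinationType.EVENT, True, True),
    Item("A cat sleeps on a sofa.", HallucinationType.EVENT, False, False),
]
scores = accuracy_by_type(items)
```

Reporting accuracy separately for each type, rather than as one pooled number, is what makes such an evaluation fine-grained: a model can do well on object questions while still failing on event questions.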
Abstract
Large Vision Language Models (LVLMs) exhibit remarkable capabilities but struggle with hallucinations: inconsistencies between images and their descriptions. Previous hallucination evaluation studies on LVLMs have identified hallucinations in terms of objects, attributes, and relations but overlooked complex hallucinations that create an entire narrative around a fictional entity. In this paper, we introduce a refined taxonomy of hallucinations, featuring a new category: Event Hallucination. We then utilize advanced LLMs to generate and filter fine-grained hallucinatory data consisting of various types of hallucinations, with a particular focus on event hallucinations, laying the groundwork for integrating discriminative and generative evaluation methods within our universal evaluation framework. The proposed benchmark distinctively assesses LVLMs' ability to tackle a broad spectrum of…
Taxonomy
Topics: Big Data and Digital Economy · Anomaly Detection Techniques and Applications
