A Unified Hallucination Mitigation Framework for Large Vision-Language Models
Yue Chang, Liqiang Jing, Xiaopeng Zhang, Yue Zhang

TL;DR
This paper introduces Dentist, a unified framework for reducing hallucinations in large vision-language models by classifying queries and applying targeted mitigation strategies, significantly improving accuracy on visual question answering tasks.
Contribution
The paper proposes a unified approach that first classifies each query and then applies a matching mitigation strategy, addressing the tendency of previous methods to handle different query types inappropriately.
Findings
Achieves over 10% improvement in accuracy on VQA tasks.
Effectively classifies queries into perception and reasoning types.
Demonstrates significant reduction in hallucinations in LVLM outputs.
Abstract
Hallucination is a common problem for Large Vision-Language Models (LVLMs) with long generations and is difficult to eradicate. A generation with hallucinations is partially inconsistent with the image content. To mitigate hallucination, current studies focus either on the model's inference process or on the generated output, but the solutions they design sometimes fail to handle the various types of queries and the hallucinations in the generations for those queries. To deal accurately with these hallucinations, we present Dentist, a unified framework for hallucination mitigation. Its core step is to first classify the query and then apply a hallucination mitigation process chosen according to the classification result, just as a dentist first observes the teeth and then makes a plan. In a simple deployment, Dentist can classify queries as perception…
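The classify-then-dispatch idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword-based classifier, the cue list, and the two placeholder handlers are all hypothetical stand-ins, and only the two query labels (perception and reasoning) come from the summary above.

```python
from typing import Callable, Dict

# Hypothetical cue phrases; the paper's actual classifier is not described in this excerpt.
REASONING_CUES = ("why", "explain", "infer", "what would")


def classify_query(query: str) -> str:
    """Toy classifier: label a query 'reasoning' if it contains a cue phrase, else 'perception'."""
    q = query.lower()
    return "reasoning" if any(cue in q for cue in REASONING_CUES) else "perception"


def mitigate_perception(query: str, answer: str) -> str:
    # Placeholder only: a real system might re-check the answer against the image.
    return f"[perception-checked] {answer}"


def mitigate_reasoning(query: str, answer: str) -> str:
    # Placeholder only: a real system might re-examine the reasoning behind the answer.
    return f"[reasoning-checked] {answer}"


# Dispatch table mapping each query type to its (hypothetical) mitigation routine.
HANDLERS: Dict[str, Callable[[str, str], str]] = {
    "perception": mitigate_perception,
    "reasoning": mitigate_reasoning,
}


def mitigate(query: str, answer: str) -> str:
    """Classify the query first, then apply the mitigation routine for that type."""
    return HANDLERS[classify_query(query)](query, answer)
```

The dispatch table keeps the classification step separate from the mitigation routines, so a different classifier or an additional query type could be swapped in without touching the handlers.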
Taxonomy
Topics: Epilepsy research and treatment · Brain Tumor Detection and Classification
Methods: Focus
