A Unified Hallucination Mitigation Framework for Large Vision-Language Models

Yue Chang; Liqiang Jing; Xiaopeng Zhang; Yue Zhang

arXiv:2409.16494 · cs.CV · September 26, 2024 · 2 cites

TL;DR

This paper introduces Dentist, a unified framework for reducing hallucinations in large vision-language models by classifying queries and applying targeted mitigation strategies, significantly improving accuracy on visual question answering tasks.

Contribution

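The paper proposes a unified approach that first classifies each query and then applies a hallucination mitigation procedure matched to its type, addressing the tendency of earlier methods to handle some query types, and the hallucinations they produce, inappropriately.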

Findings

1. Achieves over 10% improvement in accuracy on VQA tasks.
2. Effectively classifies queries into perception and reasoning types.
3. Demonstrates significant reduction in hallucinations in LVLM outputs.

Abstract

Hallucination is a common problem for Large Vision-Language Models (LVLMs) with long generations which is difficult to eradicate. The generation with hallucinations is partially inconsistent with the image content. To mitigate hallucination, current studies either focus on the process of model inference or the results of model generation, but the solutions they design sometimes do not deal appropriately with various types of queries and the hallucinations of the generations about these queries. To accurately deal with various hallucinations, we present a unified framework, Dentist, for hallucination mitigation. The core step is to first classify the queries, then perform different processes of hallucination mitigation based on the classification result, just like a dentist first observes the teeth and then makes a plan. In a simple deployment, Dentist can classify queries as perception…
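The classify-then-dispatch idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the keyword heuristic, the function names (`classify_query`, `mitigate_perception`, `mitigate_reasoning`, `mitigate`), and the placeholder mitigation steps are all hypothetical, chosen only to show how a query's type might select a mitigation routine. In practice the classifier and both routines would presumably be backed by a model rather than string matching.

```python
from enum import Enum
from typing import Callable, Dict


class QueryType(Enum):
    PERCEPTION = "perception"
    REASONING = "reasoning"


# Hypothetical cue list: a toy stand-in for a real query classifier.
REASONING_CUES = ("why", "explain", "what would")


def classify_query(query: str) -> QueryType:
    """Label a query as reasoning if it contains a cue, else perception."""
    lowered = query.lower()
    if any(cue in lowered for cue in REASONING_CUES):
        return QueryType.REASONING
    return QueryType.PERCEPTION


def mitigate_perception(query: str, draft: str) -> str:
    """Placeholder: a real routine would re-check the draft against the image."""
    return f"[perception check] {draft}"


def mitigate_reasoning(query: str, draft: str) -> str:
    """Placeholder: a real routine would re-examine the reasoning behind the draft."""
    return f"[reasoning check] {draft}"


# Dispatch table: the classification result selects the mitigation routine.
MITIGATORS: Dict[QueryType, Callable[[str, str], str]] = {
    QueryType.PERCEPTION: mitigate_perception,
    QueryType.REASONING: mitigate_reasoning,
}


def mitigate(query: str, draft: str) -> str:
    """Classify the query first, then apply the matching mitigation routine."""
    return MITIGATORS[classify_query(query)](query, draft)


if __name__ == "__main__":
    print(mitigate("What color is the car?", "The car is red."))
    print(mitigate("Why is the man holding an umbrella?", "It is raining."))
```

Keeping the routines in a dispatch table means a further query type could be added without touching `mitigate`, which is one plausible reading of what "unified" buys in such a design.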

Code & Models

Repositories

CYandYue/Dentist (PyTorch, official)
