Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Wenyi Xiao; Ziwei Huang; Leilei Gan; Wanggui He; Haoyuan Li; Zhelun Yu; Fangxun Shu; Hao Jiang; Linchao Zhu

arXiv:2404.14233 · cs.CV · January 7, 2025 · 2 cites

TL;DR

This paper introduces a fine-grained AI feedback approach to detect and reduce hallucinations in large vision language models, improving alignment with given contexts through a novel detect-then-rewrite pipeline and severity-aware optimization.

Contribution

It presents a new sentence-level hallucination detection model and a severity-aware preference optimization method for better hallucination mitigation in LVLMs.

Findings

01 Effective hallucination detection at sentence level
02 Improved hallucination mitigation with severity-aware optimization
03 Enhanced model alignment with context in experiments

Abstract

The rapidly developing Large Vision Language Models (LVLMs) have shown notable capabilities on a range of multi-modal tasks, but still face the hallucination phenomena where the generated texts do not align with the given contexts, significantly restricting the usages of LVLMs. Most previous work detects and mitigates hallucination at the coarse-grained level or requires expensive annotation (e.g., labeling by proprietary models or human experts). To address these issues, we propose detecting and mitigating hallucinations in LVLMs via fine-grained AI feedback. The basic idea is that we generate a small-size sentence-level hallucination annotation dataset by proprietary models, whereby we train a hallucination detection model which can perform sentence-level hallucination detection, covering primary hallucination types (i.e., object, attribute, and relationship). Then, we propose a…
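The sentence-level detection and detect-then-rewrite idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Finding`, `split_sentences`, and `detect_then_rewrite`, the integer severity field, and the keyword-based toy detector and rewriter are hypothetical and are not taken from the paper or its repository, where both roles are played by models. The sketch only shows the data flow: split a response into sentences, label each one with a hallucination type (object, attribute, or relationship) or none, rewrite the flagged sentences, and keep the original and rewritten responses as a preference pair.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hallucination types named in the abstract.
HALLUCINATION_TYPES = ("object", "attribute", "relationship")


@dataclass
class Finding:
    """Per-sentence detection result (hypothetical structure)."""
    sentence: str
    kind: Optional[str]  # one of HALLUCINATION_TYPES, or None if clean
    severity: int        # assumed integer scale; 0 means no hallucination


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def detect_then_rewrite(
    response: str,
    detect: Callable[[str], Finding],
    rewrite: Callable[[Finding], str],
) -> Dict[str, object]:
    """Label each sentence, rewrite flagged ones, return a preference pair."""
    findings = [detect(s) for s in split_sentences(response)]
    fixed = [rewrite(f) if f.kind else f.sentence for f in findings]
    return {
        "rejected": response,        # original, possibly hallucinated
        "chosen": " ".join(fixed),   # rewritten response
        "findings": findings,
    }


# Toy stand-ins for the detector and rewriter (assumptions, not the paper's models).
def toy_detect(sentence: str) -> Finding:
    if "purple" in sentence:
        return Finding(sentence, "attribute", 2)
    return Finding(sentence, None, 0)


def toy_rewrite(finding: Finding) -> str:
    return finding.sentence.replace("purple", "brown")


pair = detect_then_rewrite(
    "A dog sits on the grass. The dog is purple.", toy_detect, toy_rewrite
)
print(pair["chosen"])
```

In a real setting the per-sentence severity recorded in each `Finding` could be used to weight the preference pair during optimization; how the paper does this is not specified in the text above, so the sketch stops at pair construction.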

Code & Models

Repositories

Mr-Loevan/HSA-DPO (Official)

Videos

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback

Taxonomy

Topics: Big Data and Digital Economy · Anomaly Detection Techniques and Applications

Methods: ALIGN