HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data
Qifan Yu, Juncheng Li, Longhui Wei, Liang Pang, Wentao Ye, Bosheng Qin, Siliang Tang, Qi Tian, Yueting Zhuang

TL;DR
This paper introduces HalluciDoctor, a framework that detects and eliminates hallucinations in machine-generated visual instruction data, significantly reducing hallucinations and improving the robustness of multi-modal models.
Contribution
The paper presents a novel cross-checking-based hallucination detection framework and a counterfactual data augmentation method, both aimed at mitigating hallucinations in large-scale visual instruction datasets.
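The cross-checking idea can be illustrated with a minimal sketch. This is not the authors' implementation: the `Expert` callable type, the substring-based `consistency` score, the `cross_check` helper, and the `threshold` default are all hypothetical names and choices introduced here for illustration. The sketch assumes each claim in a description has already been paired with a question that probes it, and that several independent answerers (stand-ins for multi-modal models) can be queried about the image.

```python
from typing import Callable, Dict, List

# Hypothetical type: an "expert" answers a question about an image.
# Signature is (image_id, question) -> free-text answer.
Expert = Callable[[str, str], str]


def consistency(claim: str, answers: List[str]) -> float:
    """Fraction of answers that contain the claimed phrase.

    Uses a case-insensitive substring match, which is a deliberately
    crude stand-in for a real semantic similarity measure.
    """
    if not answers:
        return 0.0
    claim_norm = claim.strip().lower()
    hits = sum(1 for a in answers if claim_norm in a.strip().lower())
    return hits / len(answers)


def cross_check(
    image_id: str,
    claims: Dict[str, str],
    experts: List[Expert],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[str]:
    """Return the claimed phrases that too few experts confirm.

    `claims` maps each claimed phrase (object, relation, or attribute)
    to a question that probes it.
    """
    flagged = []
    for phrase, question in claims.items():
        answers = [expert(image_id, question) for expert in experts]
        if consistency(phrase, answers) < threshold:
            flagged.append(phrase)
    return flagged
```

Under these assumptions, a phrase that most answerers fail to confirm is flagged as a likely hallucination and becomes a candidate for removal from the instruction data.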
Findings
Reduces hallucinations by a relative 44.6%
Balances data distribution to improve model robustness
Maintains competitive performance with existing models
Abstract
Multi-modal Large Language Models (MLLMs) tuned on machine-generated instruction-following data have demonstrated remarkable performance in various multi-modal understanding and generation tasks. However, the hallucinations inherent in machine-generated data, which could lead to hallucinatory outputs in MLLMs, remain under-explored. This work aims to investigate various hallucinations (i.e., object, relation, attribute hallucinations) and mitigate those hallucinatory toxicities in large-scale machine-generated visual instruction datasets. Drawing on the human ability to identify factual errors, we present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm. We use our framework to identify and eliminate hallucinations in the training data automatically. Interestingly, HalluciDoctor also indicates that spurious correlations…
Taxonomy
Topics: Psychedelics and Drug Studies · Data Visualization and Analytics · Image and Video Quality Assessment
