CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs
Zhehan Kan, Ce Zhang, Zihan Liao, Yapeng Tian, Wenming Yang, Junyuan Xiao, Xu Li, Dongmei Jiang, Yaowei Wang, Qingmin Liao

TL;DR
CATCH is a novel decoding method designed to reduce hallucinations in LVLMs by separating visual information, detecting hallucinations, and adaptively correcting token-level outputs, thereby improving reliability in critical applications.
Contribution
The paper introduces CATCH, a comprehensive approach combining visual decoupling, hallucination detection, and adaptive decoding to mitigate hallucinations in LVLMs without additional training.
Findings
Effective hallucination reduction in LVLMs across multiple tasks.
Robust generalization to new tasks without extra training.
Applicable without specific data or prior knowledge.
Abstract
Large Vision-Language Model (LVLM) systems have demonstrated impressive vision-language reasoning capabilities but suffer from pervasive and severe hallucination issues, posing significant risks in critical domains such as healthcare and autonomous systems. Despite previous efforts to mitigate hallucinations, a persistent issue remains: visual defect from vision-language misalignment, creating a bottleneck in visual processing capacity. To address this challenge, we develop Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs (CATCH), based on the Information Bottleneck theory. CATCH introduces Complementary Visual Decoupling (CVD) for visual information separation, Non-Visual Screening (NVS) for hallucination detection, and Adaptive Token-level Contrastive Decoding (ATCD) for hallucination mitigation. CATCH addresses issues related to visual…
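For readers unfamiliar with contrastive decoding, the following is a minimal sketch of the generic idea that token-level contrastive methods build on. It is not the CATCH algorithm: it does not implement CVD, NVS, or ATCD, whose details are not given in the text above. The function names, the `alpha` and `beta` parameters, and their default values are all assumptions chosen for illustration. The sketch contrasts next-token logits computed with the original visual input against logits computed with an alternative (for example, degraded) visual input, and restricts the choice to tokens the original distribution already finds plausible.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_next_token(logits_full, logits_contrast, alpha=1.0, beta=0.1):
    """Greedy next-token choice under a generic contrastive-decoding rule.

    logits_full:     next-token logits given the original visual input
    logits_contrast: next-token logits given an alternative visual input
    alpha:           contrast strength (hypothetical default)
    beta:            plausibility cutoff relative to the most likely token
                     (hypothetical default)
    """
    lp_full = log_softmax(logits_full)
    lp_contrast = log_softmax(logits_contrast)

    # Keep only tokens whose probability under the original input is at
    # least beta times the probability of the most likely token.
    cutoff = max(lp_full) + math.log(beta)

    best_token, best_score = None, -math.inf
    for token, (lf, lc) in enumerate(zip(lp_full, lp_contrast)):
        if lf < cutoff:
            continue
        # Reward tokens supported by the original input and penalize tokens
        # that remain likely under the alternative input.
        score = (1 + alpha) * lf - alpha * lc
        if score > best_score:
            best_token, best_score = token, score
    return best_token
```

With `alpha=0.0` the rule reduces to ordinary greedy decoding on the original logits, which makes the effect of the contrast term easy to check in isolation.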
Taxonomy
Topics: Neurological disorders and treatments · Topological and Geometric Data Analysis · Digital Image Processing Techniques
