CC-VQA: Conflict- and Correlation-Aware Method for Mitigating Knowledge Conflict in Knowledge-Based Visual Question Answering
Yuyang Hong, Jiaqi Gu, Yujin Lou, Lubin Fan, Qi Yang, Ying Wang, Kun Ding, Yue Wu, Shiming Xiang, Jieping Ye

TL;DR
CC-VQA introduces a training-free, conflict- and correlation-aware approach for knowledge-based visual question answering, effectively addressing visual and knowledge conflicts to improve accuracy on multiple benchmarks.
Contribution
The paper presents CC-VQA, a novel method that incorporates visual conflict reasoning and correlation-guided encoding to mitigate knowledge conflicts in KB-VQA without additional training.
Findings
Achieves state-of-the-art results, with accuracy improvements of 3.3% to 6.4% on benchmarks.
Effectively analyzes visual-semantic conflicts across knowledge contexts.
Enhances conflict mitigation through correlation-aware encoding and adaptive decoding.
Abstract
Knowledge-based visual question answering (KB-VQA) demonstrates significant potential for handling knowledge-intensive tasks. However, conflicts arise between the static parametric knowledge that vision language models (VLMs) acquire during pre-training and dynamically retrieved information. The resulting outputs either ignore retrieved contexts or integrate them inconsistently with parametric knowledge, posing substantial challenges for KB-VQA. Current knowledge conflict mitigation methods are primarily adapted from language-based approaches and focus on context-level conflicts through engineered prompting strategies or context-aware decoding mechanisms. However, these methods neglect the critical role of visual information in conflicts and suffer from redundant retrieved contexts, which impair accurate conflict identification and effective mitigation. To address these…
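For readers unfamiliar with the "context-aware decoding mechanisms" mentioned among prior approaches, the sketch below illustrates one common contrastive formulation from the language-model literature: the next-token logits computed with the retrieved context are amplified, and the logits computed without it are subtracted. This is not CC-VQA's own method, which the truncated abstract does not describe. The function names, the `alpha` weight, and the three-token toy vocabulary are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def context_aware_logits(with_context, without_context, alpha=0.5):
    """Contrast context-conditioned logits against context-free logits.

    adjusted = (1 + alpha) * with_context - alpha * without_context
    alpha = 0 reduces to ordinary decoding with the context in the prompt.
    `alpha` is a hypothetical hyperparameter name chosen for this sketch.
    """
    if len(with_context) != len(without_context):
        raise ValueError("both logit vectors must cover the same vocabulary")
    return [
        (1 + alpha) * c - alpha * p
        for c, p in zip(with_context, without_context)
    ]


def pick_token(with_context, without_context, alpha=0.5):
    """Greedy choice of the next token index under the adjusted distribution."""
    probs = softmax(context_aware_logits(with_context, without_context, alpha))
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    # Toy logits: the model's parametric prior strongly favors token 0,
    # while the retrieved context shifts some support toward token 1.
    without_ctx = [2.0, 1.0, 0.0]
    with_ctx = [1.6, 1.5, 0.0]
    print(pick_token(with_ctx, without_ctx, alpha=0.0))
    print(pick_token(with_ctx, without_ctx, alpha=1.0))
```

In this toy setting, ordinary decoding still follows the parametric prior, while the contrastive adjustment lets the token supported by the context win. As the abstract notes, approaches of this kind operate only on context-level signals and do not account for visual information.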
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
