TL;DR
This paper introduces SegSub, a framework for testing vision-language models' robustness to knowledge conflicts and hallucinations through image perturbations. It reveals vulnerabilities and shows that targeted fine-tuning improves conflict detection.
Contribution
The study presents a systematic framework for evaluating and enhancing VLM robustness to knowledge conflicts and hallucinations, filling a gap in multimodal model research.
Findings
VLMs are robust to parametric conflicts (~20% adherence)
VLMs struggle with counterfactual identification (<30% accuracy)
Fine-tuning improves conflict detection capabilities
Abstract
Vision-language models (VLMs) demonstrate sophisticated multimodal reasoning yet are prone to hallucination when confronted with knowledge conflicts, impeding their deployment in information-sensitive contexts. While existing research addresses robustness in unimodal models, the multimodal domain lacks systematic investigation of cross-modal knowledge conflicts. This research introduces SegSub, a framework for applying targeted image perturbations to investigate VLM resilience against knowledge conflicts. Our analysis reveals distinct vulnerability patterns: while VLMs are robust to parametric conflicts (20% adherence rates), they exhibit significant weaknesses in identifying counterfactual conditions (<30% accuracy) and resolving source conflicts (<1% accuracy). Correlations between contextual richness and hallucination rate (r = -0.368, p = 0.003) reveal the kinds of images that are…
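To make the reported correlation concrete, here is a minimal sketch of how a coefficient such as r = -0.368 could be computed, assuming r denotes Pearson's correlation coefficient. The excerpt does not define how contextual richness is scored, so the per-image `richness` values and the `hallucination_rate` values below are hypothetical toy numbers, not data from the paper, and the function name `pearson_r` is illustrative only. The sketch does not compute the p-value.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy values (NOT from the paper): one richness score and one
# hallucination rate per group of images.
richness = [1, 2, 3, 4, 5]
hallucination_rate = [0.5, 0.4, 0.45, 0.3, 0.2]

r = pearson_r(richness, hallucination_rate)
print(f"r = {r:.3f}")
```

A negative coefficient, as in the abstract, would mean that images with richer context tend to show lower hallucination rates under this scoring.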
