CLASH: A Benchmark for Cross-Modal Contradiction Detection

Teodora Popordanoska; Jiameng Li; Matthew B. Blaschko

arXiv:2511.19199 · cs.CV · November 25, 2025

TL;DR

CLASH is a new benchmark designed to evaluate and improve models' ability to detect contradictions between images and captions, addressing a critical gap in multimodal understanding and reliability.

Contribution

We introduce CLASH, the first comprehensive benchmark for cross-modal contradiction detection, including a large dataset with controlled contradictions and evaluation protocols.

Findings

01. State-of-the-art models struggle with contradiction detection.

02. Fine-tuning on CLASH improves model performance significantly.

03. Models exhibit modality biases and category-specific weaknesses.

Abstract

Contradictory multimodal inputs are common in real-world settings, yet existing benchmarks typically assume input consistency and fail to evaluate cross-modal contradiction detection, a fundamental capability for preventing hallucinations and ensuring reliability. We introduce CLASH, a novel benchmark for multimodal contradiction detection, featuring COCO images paired with contradictory captions containing controlled object-level or attribute-level contradictions. The samples include targeted questions evaluated in both multiple-choice and open-ended formats. The benchmark provides an extensive fine-tuning set filtered through automated quality checks, alongside a smaller human-verified diagnostic set. Our analysis of state-of-the-art models reveals substantial limitations in recognizing cross-modal conflicts, exposing systematic modality biases and category-specific weaknesses.…
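To make the multiple-choice setup described in the abstract concrete, here is a minimal Python sketch of how a sample and a per-category accuracy score might look. Everything in it is an assumption for illustration: the `ContradictionSample` class, its field names, the `accuracy_by_type` helper, and the toy captions and predictions are hypothetical and are not taken from the CLASH release, whose actual schema and metrics may differ.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContradictionSample:
    # Hypothetical schema: field names are illustrative, not the benchmark's own.
    image_id: str            # identifier of the paired image (placeholder values below)
    caption: str             # caption that contradicts the image
    contradiction_type: str  # "object" or "attribute"
    question: str            # targeted question about the conflict
    choices: tuple           # multiple-choice options
    answer_index: int        # index of the correct option in `choices`


def accuracy_by_type(samples, predictions):
    """Return multiple-choice accuracy grouped by contradiction type."""
    if len(samples) != len(predictions):
        raise ValueError("samples and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample, predicted in zip(samples, predictions):
        total[sample.contradiction_type] += 1
        if predicted == sample.answer_index:
            correct[sample.contradiction_type] += 1
    return {kind: correct[kind] / total[kind] for kind in total}


# Toy data, invented for this sketch only.
QUESTION = "Do the image and the caption agree?"
CHOICES = ("They conflict", "They agree", "Cannot tell")
samples = [
    ContradictionSample("img-001", "A cat sits on the sofa.", "object", QUESTION, CHOICES, 0),
    ContradictionSample("img-002", "A bus waits at the stop.", "object", QUESTION, CHOICES, 0),
    ContradictionSample("img-003", "A red umbrella on the beach.", "attribute", QUESTION, CHOICES, 0),
]
predictions = [0, 1, 0]  # hypothetical model outputs (chosen option indices)

scores = accuracy_by_type(samples, predictions)
print(scores)
```

Grouping accuracy by contradiction type is one simple way to surface the kind of category-specific weaknesses the abstract mentions; scoring the open-ended format would need a different, answer-matching procedure that this sketch does not attempt.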

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Adversarial Robustness in Machine Learning