Once Correct, Still Wrong: Counterfactual Hallucination in Multilingual Vision-Language Models

Basel Mousi; Fahim Dalvi; Shammur Chowdhury; Firoj Alam; Nadir Durrani

arXiv:2602.05437·cs.CL·April 22, 2026

TL;DR

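This paper introduces M$^2$CQA, a multilingual, culturally grounded benchmark for evaluating counterfactual hallucination in vision-language models. Built from images spanning 17 MENA countries, it shows that models accept culturally plausible but visually incorrect statements more often in Arabic and its dialects than in English.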

Contribution

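The paper presents M$^2$CQA, a culturally grounded benchmark, and the CounterFactual Hallucination Rate (CFHR), a metric that measures counterfactual acceptance conditioned on correctly answering the true statement. Together they expose failures in multilingual and dialectal settings that raw accuracy hides.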

Findings

01. Counterfactual hallucination rates are higher in Arabic dialects.

02. Reasoning-first prompting increases hallucination.

03. Answering before justification improves robustness.

Abstract

Vision-language models (VLMs) can achieve high accuracy while still accepting culturally plausible but visually incorrect interpretations. Existing hallucination benchmarks rarely test this failure mode, particularly outside Western contexts and English. We introduce M$^2$CQA, a culturally grounded multimodal benchmark built from images spanning 17 MENA countries, paired with contrastive true and counterfactual statements in English, Arabic, and its dialects. To isolate hallucination beyond raw accuracy, we propose the CounterFactual Hallucination Rate (CFHR), which measures counterfactual acceptance conditioned on correctly answering the true statement. Evaluating state-of-the-art VLMs under multiple prompting strategies, we find that CFHR rises sharply in Arabic, especially in dialects, even when true-statement accuracy remains high. Moreover, reasoning-first prompting consistently…
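The abstract describes CFHR only in words, so a small sketch may help make the conditioning concrete. The code below assumes each benchmark item yields two binary judgments (whether the model accepted the true statement and whether it accepted the paired counterfactual statement) and computes the share of counterfactual acceptances among items whose true statement was answered correctly. The `PairResult` type, the function names, and the binary accept/reject framing are illustrative assumptions, not the paper's implementation, and the paper's exact formulation may differ.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PairResult:
    """Hypothetical record of one image with its contrastive statement pair."""
    accepts_true: bool            # model judged the true statement as correct
    accepts_counterfactual: bool  # model judged the counterfactual statement as correct


def true_statement_accuracy(results: Sequence[PairResult]) -> Optional[float]:
    """Fraction of items whose true statement the model accepted."""
    if not results:
        return None
    return sum(r.accepts_true for r in results) / len(results)


def cfhr(results: Sequence[PairResult]) -> Optional[float]:
    """Counterfactual acceptance rate, restricted to items answered
    correctly on the true statement (an assumed reading of CFHR)."""
    conditioned = [r for r in results if r.accepts_true]
    if not conditioned:
        return None  # undefined when no true statement was answered correctly
    return sum(r.accepts_counterfactual for r in conditioned) / len(conditioned)


# Toy data, invented purely for illustration.
results = [
    PairResult(accepts_true=True, accepts_counterfactual=True),
    PairResult(accepts_true=True, accepts_counterfactual=False),
    PairResult(accepts_true=False, accepts_counterfactual=True),
    PairResult(accepts_true=True, accepts_counterfactual=False),
]
accuracy = true_statement_accuracy(results)
rate = cfhr(results)
```

In this toy case three of the four true statements are accepted, and one of those three items also has its counterfactual accepted. The third item is excluded from the CFHR denominator because its true statement was missed, which is how the metric separates hallucination from raw accuracy.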

Code & Models

Repositories

Dataset (Hugging Face): https://huggingface.co/datasets/QCRI/M2CQA
