CXR-ContraBench: Benchmarking Negated-Option Attraction in Medical VLMs

Zhengru Fang, Yanan Ma, Yu Guo, Senkang Hu, Yixian Zhang, Hangcheng Cao, Wenbo Ding, and Yuguang Fang

arXiv:2605.05810 · cs.CV · May 8, 2026

TL;DR

This paper introduces CXR-ContraBench, a benchmark for evaluating and repairing negated-option attraction failures in medical vision-language models, highlighting significant clinical risks and proposing a deterministic fix.

Contribution

The paper presents a new diagnostic benchmark and a repair method for negation errors in medical VLMs, improving clinical reliability without retraining.

Findings

01

Medical VLMs show substantial failure rates when a negated answer option conflicts with the image.

02

Chain-of-thought prompting reduces but does not eliminate negation errors.

03

QCCV-Neg repair significantly improves model accuracy on polarity-confused cases.

Abstract

When a chest X-ray shows consolidation but the question asks which finding is present, a medical vision-language model may answer "No consolidation." This is more than an incorrect choice: it is a polarity reversal that emits a clinical statement contradicting the image. We study this failure as negated-option attraction, where a model is drawn to a negated answer option even when it conflicts with both the visual evidence and the question. We introduce CXR-ContraBench (Chest X-Ray Contradiction Benchmark), a diagnostic benchmark spanning internal ReXVQA slices and external OpenI and CheXpert protocols. The benchmark centers on present-finding questions, where selecting "No X" despite visible X creates the main clinical risk, and uses absent-finding questions as secondary tests of whether models copy negated wording. Across CheXpert protocols, the failure is substantial and persistent.…
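The failure described in the abstract can be illustrated with a small scoring sketch. It is not the benchmark's actual metric or code. It assumes a hypothetical record format (`Item`, with the fields `finding`, `present`, `options`, `chosen`) and a simple string heuristic (`is_negated`) that treats any option of the form "No X" as the negated option. It counts how often a model picks "No X" on present-finding questions, where X is visible in the image.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical record for one multiple-choice question (not the paper's schema)."""
    finding: str        # finding the question is about, e.g. "consolidation"
    present: bool       # ground truth: is the finding visible in the image?
    options: list[str]  # answer options shown to the model
    chosen: str         # option the model selected


def is_negated(option: str, finding: str) -> bool:
    """Heuristic assumption: a negated option reads 'No <finding>'."""
    text = option.strip().lower()
    return text.startswith("no ") and finding.lower() in text


def negated_option_attraction_rate(items: list[Item]) -> float:
    """Share of present-finding items where the model chose the negated option."""
    present_items = [it for it in items if it.present]
    if not present_items:
        return 0.0
    flips = sum(is_negated(it.chosen, it.finding) for it in present_items)
    return flips / len(present_items)


items = [
    # Polarity reversal: consolidation is visible, the model says "No consolidation".
    Item("consolidation", True, ["Consolidation", "No consolidation"], "No consolidation"),
    # Correct present-finding answer.
    Item("consolidation", True, ["Consolidation", "No consolidation"], "Consolidation"),
    # Absent-finding item: excluded from this rate.
    Item("edema", False, ["Edema", "No edema"], "No edema"),
]

rate = negated_option_attraction_rate(items)
print(rate)  # 0.5
```

Restricting the denominator to present-finding items follows the abstract's framing that selecting "No X" despite visible X is the main clinical risk. Absent-finding items would need a separate tally.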

Code & Models

Repositories

fangzr/cxr-contrabench-code
github
