Bi-MCQ: Reformulating Vision-Language Alignment for Negation Understanding

Tae Hun Kim; Hyun Gyu Lee

arXiv:2601.22696 · cs.CV · February 2, 2026

Open Access

TL;DR

This paper introduces Bi-MCQ, a novel framework that reformulates vision-language alignment as a conditional semantic comparison task, significantly improving negation understanding in medical image analysis models.

Contribution

It proposes a bi-directional multiple-choice learning approach with direction-specific modules to enhance negation comprehension in vision-language models.

Findings

01 Up to 0.47 AUC improvement over state-of-the-art models.

02 Reduces affirmative-negative AUC gap by 0.12 on average.

03 Enhances negation understanding in medical VLMs.
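
For readers unfamiliar with the metric in finding 02, the following is a minimal sketch of how an affirmative-negative AUC gap could be computed. It is not taken from the paper: the function names `auc` and `auc_gap`, the toy labels and scores, and the choice to define the gap as the absolute difference between the two AUCs are all assumptions made for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive example is scored
    above a random negative example; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_gap(labels_aff, scores_aff, labels_neg, scores_neg):
    """Assumed definition: |AUC on affirmative prompts - AUC on negated prompts|."""
    return abs(auc(labels_aff, scores_aff) - auc(labels_neg, scores_neg))


# Hypothetical toy data: the same model evaluated with affirmative prompts
# ("there is X") and with negated prompts ("there is no X").
labels = [1, 1, 0, 0]
scores_affirmative = [0.9, 0.8, 0.2, 0.1]  # perfectly ranked, AUC = 1.0
scores_negated = [0.6, 0.4, 0.5, 0.3]      # one misranked pair, AUC = 0.75

gap = auc_gap(labels, scores_affirmative, labels, scores_negated)
print(gap)  # 0.25
```

A smaller gap under this definition means the model ranks cases about equally well whether the prompt is phrased affirmatively or as a negation.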

Abstract

Recent vision-language models (VLMs) achieve strong zero-shot performance via large-scale image-text pretraining and have been widely adopted in medical image analysis. However, existing VLMs remain notably weak at understanding negated clinical statements, largely due to contrastive alignment objectives that treat negation as a minor linguistic variation rather than a meaning-inverting operator. In multi-label settings, prompt-based InfoNCE fine-tuning further reinforces easy-positive image-prompt alignments, limiting effective learning of disease absence. To overcome these limitations, we reformulate vision-language alignment as a conditional semantic comparison problem, which is instantiated through a bi-directional multiple-choice learning framework (Bi-MCQ). By jointly training Image-to-Text and Text-to-Image MCQ tasks with affirmative, negative, and mixed prompts, our method…
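
The idea of treating alignment as a multiple-choice comparison in both directions can be illustrated with a small, generic sketch. This is not the authors' implementation: the abstract does not specify the loss, the similarity function, or the architecture, so the cosine similarity, the temperature value, the softmax cross-entropy form, the equal weighting of the two directions, and every name below (`mcq_loss`, `bi_mcq_loss`, the toy embeddings) are assumptions chosen only to make the concept concrete.

```python
import math


def cosine(u, v):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mcq_loss(query, candidates, correct_index, temperature=0.1):
    """Cross-entropy of picking the correct candidate for a query.

    Computes -log softmax(similarity / temperature)[correct_index],
    using the log-sum-exp trick for numerical stability.
    """
    logits = [cosine(query, c) / temperature for c in candidates]
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[correct_index]


def bi_mcq_loss(image, text_options, text_answer,
                text, image_options, image_answer):
    """Assumed combination: sum of Image-to-Text and Text-to-Image MCQ losses."""
    image_to_text = mcq_loss(image, text_options, text_answer)
    text_to_image = mcq_loss(text, image_options, image_answer)
    return image_to_text + text_to_image


# Hypothetical 2-D embeddings. The image shows the finding, so the
# affirmative prompt (index 0) is the correct answer among the text options.
image_with_finding = [1.0, 0.0]
image_without_finding = [-1.0, 0.0]
prompt_affirmative = [0.9, 0.1]   # e.g. "there is X"
prompt_negated = [-0.9, 0.1]      # e.g. "there is no X"

loss = bi_mcq_loss(
    image_with_finding, [prompt_affirmative, prompt_negated], 0,
    prompt_negated, [image_with_finding, image_without_finding], 1,
)
print(loss)
```

In this toy setup the negated prompt is an explicit wrong answer rather than a near-duplicate of the affirmative one, which is one way to read the abstract's point that negation should act as a meaning-inverting operator.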

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Machine Learning in Healthcare