MI-CXR: A Benchmark for Longitudinal Reasoning over Multi-Interval Chest X-rays

Sunghwan Steve Cho, Yunseok Han, and Jaeyoung Do

arXiv:2605.15574 · cs.CV · May 18, 2026

TL;DR

MI-CXR is a new benchmark designed to evaluate models' ability to perform longitudinal reasoning over multi-visit chest X-ray sequences, highlighting current models' limitations in temporal understanding.

Contribution

The paper introduces MI-CXR, a benchmark for multi-interval longitudinal reasoning over chest X-ray sequences, built from five-way multiple-choice questions over five-visit patient timelines, and uses it to evaluate 14 state-of-the-art vision-language models (VLMs).

Findings

01

Across the 14 evaluated VLMs, average accuracy is only 29.3%, modestly above the 20% expected from uniform random guessing on five-way multiple-choice questions.

02

Models produce plausible local descriptions but struggle with temporal constraints.

03

MI-CXR reveals key limitations of current vision-language models in medical longitudinal reasoning.

Abstract

Longitudinal chest X-ray (CXR) interpretation requires reasoning over disease evolution across multiple patient visits, yet most existing medical VQA benchmarks focus on single images or short-horizon image pairs. We introduce MI-CXR, a benchmark for standardized evaluation of Multi-Interval longitudinal reasoning over multi-visit CXR sequences, without requiring free-form report generation or additional clinical context. MI-CXR comprises five-way multiple-choice questions over five-visit patient timelines and instantiates three complementary task families: Temporal Event Localization, Interval-wise Change Reasoning, and Global Trajectory Summarization, which assess clinically grounded visual reasoning over time. Evaluating 14 state-of-the-art vision-language models (VLMs) shows low overall performance, with an average accuracy of 29.3%, only modestly above random guessing. Using…
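The evaluation setup described in the abstract (five-way multiple-choice questions over five-visit timelines, grouped into three task families) can be sketched roughly as below. This is a minimal illustration, not the benchmark's actual data format or scoring script: the class `TimelineQuestion`, its field names, the helper functions, and the per-question averaging are all assumptions made for this example. Only the task family names and the counts of visits and answer options come from the abstract.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Task family names and the 5-visit / 5-option counts come from the abstract.
# Every class, field, and function name below is hypothetical.
TASK_FAMILIES = (
    "Temporal Event Localization",
    "Interval-wise Change Reasoning",
    "Global Trajectory Summarization",
)
NUM_VISITS = 5
NUM_OPTIONS = 5


@dataclass
class TimelineQuestion:
    """One multiple-choice question over a patient's visit timeline (assumed schema)."""
    task: str
    visit_images: List[str]   # hypothetical image paths, one per visit
    options: List[str]        # candidate answers
    answer_index: int         # index of the correct option

    def __post_init__(self) -> None:
        if self.task not in TASK_FAMILIES:
            raise ValueError(f"unknown task family: {self.task}")
        if len(self.visit_images) != NUM_VISITS:
            raise ValueError(f"expected {NUM_VISITS} visits")
        if len(self.options) != NUM_OPTIONS:
            raise ValueError(f"expected {NUM_OPTIONS} options")
        if not 0 <= self.answer_index < NUM_OPTIONS:
            raise ValueError("answer_index out of range")


def chance_accuracy(num_options: int = NUM_OPTIONS) -> float:
    """Expected accuracy of uniform random guessing."""
    return 1.0 / num_options


def accuracy_by_task(
    questions: Sequence[TimelineQuestion], predictions: Sequence[int]
) -> Dict[str, float]:
    """Per-task accuracy plus an 'overall' score averaged over all questions."""
    if len(questions) != len(predictions):
        raise ValueError("questions and predictions must have the same length")
    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for question, predicted in zip(questions, predictions):
        total[question.task] = total.get(question.task, 0) + 1
        correct[question.task] = correct.get(question.task, 0) + int(
            predicted == question.answer_index
        )
    scores = {task: correct[task] / total[task] for task in total}
    n = sum(total.values())
    scores["overall"] = sum(correct.values()) / n if n else 0.0
    return scores
```

Under this sketch, `chance_accuracy()` gives 0.2 for five options, so the reported 29.3% average sits 9.3 percentage points above uniform guessing. How the paper itself aggregates scores across tasks and models is not stated in the text above, so the per-question averaging here should be read as one plausible choice.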

Code & Models

Repositories

AIDASLab/MI-CXR (GitHub)