MMXU: A Multi-Modal and Multi-X-ray Understanding Dataset for Disease Progression

Linjie Mu; Zhongzhen Huang; Shengqian Qin; Yakun Zhu; Shaoting Zhang; Xiaofan Zhang

arXiv:2502.11651 · cs.CV · May 26, 2025

TL;DR

This paper introduces MMXU, a medical visual question answering (MedVQA) dataset focused on how disease progresses between patient visits, and proposes a method that improves large vision-language models (LVLMs) by incorporating historical patient records.

Contribution

The paper presents MMXU, a novel multi-image dataset for MedVQA focusing on disease progression, and introduces MAG, a method that integrates historical records to enhance diagnostic accuracy.

Findings

01. Incorporating historical records improves diagnostic accuracy by at least 20%.

02. Current LVLMs struggle with disease progression tasks on MMXU.

03. Fine-tuning with MAG enhances model performance significantly.

Abstract

Large vision-language models (LVLMs) have shown great promise in medical applications, particularly in visual question answering (MedVQA) and diagnosis from medical images. However, existing datasets and models often fail to consider critical aspects of medical diagnostics, such as the integration of historical records and the analysis of disease progression over time. In this paper, we introduce MMXU (Multimodal and MultiX-ray Understanding), a novel dataset for MedVQA that focuses on identifying changes in specific regions between two patient visits. Unlike previous datasets that primarily address single-image questions, MMXU enables multi-image questions, incorporating both current and historical patient data. We demonstrate the limitations of current LVLMs in identifying disease progression on MMXU-test, even those that perform well on traditional benchmarks. To address…

Code & Models

Repositories

linjiemu/mmxu (official)

Datasets

LinjieMu/MMXU (dataset, 31 downloads)

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging