X-PCR: A Benchmark for Cross-modality Progressive Clinical Reasoning in Ophthalmic Diagnosis

Gui Wang; Zehao Zhong; YongSong Zhou; Yudong Li; Ende Wu; Wooi Ping Cheah; Rong Qu; Jianfeng Ren; Linlin Shen

arXiv:2604.20350 · cs.CV · April 23, 2026

TL;DR

X-PCR introduces a comprehensive benchmark for evaluating multi-modal large language models' clinical reasoning in ophthalmology, emphasizing progressive reasoning and cross-modal integration across diverse imaging modalities.

Contribution

X-PCR presents the first benchmark covering a complete ophthalmology diagnostic workflow, comprising two reasoning tasks and a large curated dataset for evaluating MLLMs.

Findings

01

Evaluation of 21 current MLLMs reveals significant gaps in their clinical reasoning capabilities.

02

The benchmark covers 52 ophthalmic diseases with 26,415 images and 177,868 expert-verified VQA pairs curated from 51 public datasets.

03

The results highlight the need for models with stronger progressive and cross-modal reasoning.

Abstract

Despite significant progress in Multi-modal Large Language Models (MLLMs), their clinical reasoning capacity for multi-modal diagnosis remains largely unexamined. Current benchmarks, built mostly on single-modality data, cannot evaluate the progressive reasoning and cross-modal integration essential to clinical practice. We introduce the Cross-Modality Progressive Clinical Reasoning (X-PCR) benchmark, the first comprehensive evaluation of MLLMs through a complete ophthalmology diagnostic workflow, with two reasoning tasks: 1) a six-stage progressive reasoning chain spanning image quality assessment to clinical decision-making, and 2) a cross-modality reasoning task integrating six imaging modalities. The benchmark comprises 26,415 images and 177,868 expert-verified VQA pairs curated from 51 public datasets, covering 52 ophthalmic diseases. Evaluation of 21 MLLMs reveals critical gaps in progressive…
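As a rough illustration of what scoring a six-stage progressive chain could involve, the sketch below computes per-stage accuracy and an all-stages-correct rate over graded answers. It is a minimal sketch under assumptions, not the paper's metric: the record format `(case_id, stage_index, is_correct)`, the function names, and the "chain accuracy" definition are all hypothetical, and the stages are referred to by index because the abstract names only the first (image quality assessment) and the last (clinical decision-making).

```python
from collections import defaultdict

# Hypothetical record format: (case_id, stage_index, is_correct).
# Stage indices 1..6 stand in for the six-stage chain; the actual
# stage definitions and scoring rules are not given in the abstract.
NUM_STAGES = 6


def per_stage_accuracy(records):
    """Fraction of correct answers at each stage, scored independently."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for _case, stage, ok in records:
        totals[stage] += 1
        hits[stage] += int(ok)
    return {s: hits[s] / totals[s] for s in sorted(totals)}


def chain_accuracy(records, num_stages=NUM_STAGES):
    """Fraction of cases answered correctly at every stage (assumed metric)."""
    by_case = defaultdict(dict)
    for case, stage, ok in records:
        by_case[case][stage] = ok
    if not by_case:
        return 0.0
    complete = sum(
        all(stages.get(s, False) for s in range(1, num_stages + 1))
        for stages in by_case.values()
    )
    return complete / len(by_case)


# Toy data: case "a" is correct at every stage, case "b" fails stage 4.
records = [("a", s, True) for s in range(1, NUM_STAGES + 1)]
records += [("b", s, s != 4) for s in range(1, NUM_STAGES + 1)]

print(per_stage_accuracy(records))
print(chain_accuracy(records))
```

Reporting both numbers separates a model that is weak at one stage from a model that rarely completes a whole chain without error; whether X-PCR reports either quantity is not stated in the text above.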

Code & Models

Repositories

CVI-SZU/X-PCR (GitHub)
