CURE: A Multimodal Benchmark for Clinical Understanding and Retrieval Evaluation
Yannian Gu, Zhongzhen Huang, Linjie Mu, Xizhuo Zhang, Shaoting Zhang, Xiaofan Zhang

TL;DR
The paper introduces CURE, a benchmark for evaluating multimodal clinical reasoning and retrieval, revealing significant gaps in current models' ability to independently retrieve and utilize medical evidence.
Contribution
CURE provides a novel, controlled benchmark for disentangling reasoning and retrieval in multimodal clinical AI models, facilitating targeted improvements.
Findings
Models perform well with provided evidence (up to 73.4% accuracy).
Retrieval-based performance drops significantly (as low as 25.4%).
These results highlight how difficult evidence retrieval and integration remain for clinical AI.
Abstract
Multimodal large language models (MLLMs) demonstrate considerable potential in clinical diagnostics, a domain that inherently requires synthesizing complex visual and textual data alongside consulting authoritative medical literature. However, existing benchmarks primarily evaluate MLLMs in end-to-end answering scenarios. This limits the ability to disentangle a model's foundational multimodal reasoning from its proficiency in evidence retrieval and application. We introduce the Clinical Understanding and Retrieval Evaluation (CURE) benchmark. Comprising multimodal clinical cases mapped to physician-cited reference literature, CURE evaluates reasoning and retrieval under controlled evidence settings to disentangle their respective contributions. We evaluate state-of-the-art MLLMs across distinct evidence-gathering paradigms in both closed-ended and open-ended diagnosis tasks.…
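To make the idea of scoring under controlled evidence settings concrete, here is a minimal Python sketch that computes closed-ended accuracy per setting and the gap between two settings. It is not the benchmark's evaluation code: the record fields, the setting names `evidence_provided` and `model_retrieved`, and the toy records are all assumptions made up for illustration.

```python
from collections import defaultdict


def accuracy_by_setting(records):
    """Return {setting: fraction of correct predictions} for closed-ended answers."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        total[record["setting"]] += 1
        if record["prediction"] == record["answer"]:
            correct[record["setting"]] += 1
    return {setting: correct[setting] / total[setting] for setting in total}


# Toy records invented for illustration only; they are not benchmark data.
records = [
    {"setting": "evidence_provided", "prediction": "A", "answer": "A"},
    {"setting": "evidence_provided", "prediction": "B", "answer": "B"},
    {"setting": "model_retrieved", "prediction": "C", "answer": "A"},
    {"setting": "model_retrieved", "prediction": "B", "answer": "B"},
]

scores = accuracy_by_setting(records)

# Difference in accuracy between the two hypothetical settings.
gap = scores["evidence_provided"] - scores["model_retrieved"]
```

Reporting the per-setting scores side by side, rather than a single end-to-end number, is what lets a reader attribute a drop to retrieval rather than to reasoning.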
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Machine Learning in Healthcare
