R2MED: A Benchmark for Reasoning-Driven Medical Retrieval
Xiangxu Zhang, Lei Li, Xiao Zhou, Zheng Liu

TL;DR
R2MED is a new benchmark designed to evaluate reasoning-driven medical retrieval, revealing current models' limitations in handling complex clinical information needs.
Contribution
The paper introduces the first benchmark focused on reasoning-driven medical retrieval, covering diverse medical scenarios and exposing the gap between what existing retrieval systems can do and what clinical reasoning requires.
Findings
Even the best models achieve only 31.4 nDCG@10, which shows how difficult the benchmark is.
Classical re-ranking and generation methods offer limited improvements.
Large reasoning models improve performance but still fall short of clinical reasoning demands.
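For readers unfamiliar with the metric cited above, here is a minimal sketch of how nDCG@10 can be computed for a single query. It is not taken from the R2MED codebase. The function names, the `qrels` dictionary layout, and the linear-gain formulation are assumptions chosen for illustration, and evaluation toolkits may differ in detail. Since nDCG lies in [0, 1], a reported score of 31.4 is presumably the query-averaged value multiplied by 100.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k graded relevance values."""
    # enumerate() is 0-based, so the item at 1-based position p gets discount log2(p + 1).
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """nDCG@k for one query (linear gain; illustrative helper, not the paper's code).

    ranked_doc_ids: document ids in the order the retriever returned them.
    qrels: dict mapping doc id -> graded relevance (ids not present count as 0).
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal_gains = sorted(qrels.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents are known for this query
    return dcg_at_k(gains, k) / ideal_dcg


# Hypothetical query with one relevant document that the retriever ranked second.
score = ndcg_at_k(["d2", "d1", "d3"], {"d1": 1})
print(round(100 * score, 1))
```

A benchmark-level score would then be the mean of this per-query value over all queries in a task.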
Abstract
Current medical retrieval benchmarks primarily emphasize lexical or shallow semantic similarity, overlooking the reasoning-intensive demands that are central to clinical decision-making. In practice, physicians often retrieve authoritative medical evidence to support diagnostic hypotheses. Such evidence typically aligns with an inferred diagnosis rather than the surface form of a patient's symptoms, leading to low lexical or semantic overlap between queries and relevant documents. To address this gap, we introduce R2MED, the first benchmark explicitly designed for reasoning-driven medical retrieval. It comprises 876 queries spanning three tasks: Q&A reference retrieval, clinical evidence retrieval, and clinical case retrieval. These tasks are drawn from five representative medical scenarios and twelve body systems, capturing the complexity and diversity of real-world medical information…
Code & Models
- R2MED/Biology (dataset · 225 downloads)
- R2MED/Bioinformatics (dataset · 214 downloads)
- R2MED/Medical-Sciences (dataset · 290 downloads)
- R2MED/MedXpertQA-Exam (dataset · 251 downloads)
- R2MED/MedQA-Diag (dataset · 212 downloads)
- R2MED/PMC-Treatment (dataset · 158 downloads)
- R2MED/PMC-Clinical (dataset · 365 downloads)
- R2MED/IIYi-Clinical (dataset · 258 downloads)
