MIMIC-RD: Can LLMs differentially diagnose rare diseases in real-world clinical settings?
Zilal Eiz AlDin, John Wu, Jeffrey Paul Fung, Jennifer King, Mya Watts, Lauren ONeill, Adam Richard Cross, Jimeng Sun

TL;DR
This paper introduces MIMIC-RD, a new benchmark for evaluating large language models' ability to diagnose rare diseases in real-world clinical settings, revealing current models' limitations.
Contribution
The study presents MIMIC-RD, a novel benchmark constructed from clinical text mapped to Orphanet, and evaluates LLMs' performance, highlighting significant gaps in rare disease diagnosis.
Findings
Current LLMs perform poorly on rare disease diagnosis
Existing benchmarks do not reflect real-world clinical complexity
Future research is needed to improve LLM diagnostic capabilities
Abstract
Despite rare diseases affecting 1 in 10 Americans, their differential diagnosis remains challenging. Due to their impressive recall abilities, large language models (LLMs) have been recently explored for differential diagnosis. Existing approaches to evaluating LLM-based rare disease diagnosis suffer from two critical limitations: they rely on idealized clinical case studies that fail to capture real-world clinical complexity, or they use ICD codes as disease labels, which significantly undercounts rare diseases since many lack direct mappings to comprehensive rare disease databases like Orphanet. To address these limitations, we explore MIMIC-RD, a rare disease differential diagnosis benchmark constructed by directly mapping clinical text entities to Orphanet. Our methodology involved an initial LLM-based mining process followed by validation from four medical annotators to confirm…
Taxonomy
Topics: Genomics and Rare Diseases · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare
