RareCollab: an LLM-powered framework for multimodal reasoning in Mendelian disease diagnosis

Guantong Qi; Jiasheng Wang; Mei Ling Chong; Zahid Shaik; Shenglan Li; Shinya Yamamoto; Maura R.Z. Ruzhnikov; Devon E. Bonner; Jennefer N. Carter; Kevin S. Smith; Matthew T. Wheeler; Stephen B. Montgomery; Jonathan A. Bernstein; Sasidhar Pasupuleti; Undiagnosed Diseases Network; Pengfei Liu; Hu Chen; Zhandong Liu

arXiv:2602.04058 · q-bio.GN · April 28, 2026

TL;DR

RareCollab is an LLM-powered framework that integrates genomic, phenotypic, and transcriptomic evidence for improved Mendelian disease diagnosis, outperforming existing methods in a large real-world benchmark.

Contribution

The paper introduces RareCollab, a multimodal reasoning framework that uses large language models as interpretable reasoning modules for rare disease diagnosis.

Findings

1. RareCollab prioritized 94% of diagnostic genes within the top 10.

2. RareCollab outperformed proprietary phenotype-driven LLMs by over 25% on average.

3. RNA evidence contributed to 35% of diagnostic gene prioritizations.
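
The top-10 figure in the findings above is a top-k hit rate: the share of patients whose diagnostic gene appears among the first k ranked candidates. The sketch below shows one common way to compute such a metric. It is illustrative only and is not taken from the paper; the function name `top_k_hit_rate`, the patient IDs, and the gene labels are all hypothetical, and the paper's exact evaluation protocol may differ (for example, in how ties or multi-gene diagnoses are handled).

```python
from typing import Dict, List


def top_k_hit_rate(
    rankings: Dict[str, List[str]],
    truth: Dict[str, str],
    k: int = 10,
) -> float:
    """Return the fraction of patients whose diagnostic gene is in the top k.

    rankings maps a patient ID to that patient's genes, best candidate first.
    truth maps a patient ID to the single confirmed diagnostic gene.
    A patient with no ranking counts as a miss.
    """
    if not truth:
        raise ValueError("truth must contain at least one patient")
    hits = 0
    for patient, gene in truth.items():
        ranked = rankings.get(patient, [])
        if gene in ranked[:k]:
            hits += 1
    return hits / len(truth)


# Hypothetical toy data, not from the paper.
toy_rankings = {
    "P1": ["GENE_A", "GENE_B", "GENE_C"],
    "P2": ["GENE_D", "GENE_E", "GENE_F"],
    "P3": ["GENE_G", "GENE_H", "GENE_I"],
    "P4": ["GENE_J", "GENE_K", "GENE_L"],
}
toy_truth = {"P1": "GENE_A", "P2": "GENE_F", "P3": "GENE_X", "P4": "GENE_K"}

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, top_k_hit_rate(toy_rankings, toy_truth, k=k))
```

Under this definition, a reported 94% at k = 10 would mean the diagnostic gene was ranked tenth or better for 94% of the evaluated cases.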

Abstract

Rare disease diagnosis increasingly relies on integrating genomic, phenotypic and transcriptomic evidence, yet these signals remain difficult to reconcile within a common interpretive framework. Here we present RareCollab, an LLM-powered framework for multimodal reasoning in Mendelian disease diagnosis that integrates more than 100 diagnostic evidence signals across DNA, RNA, phenotype, curated variant-level knowledge, and in-silico pathogenicity evidence. This design enables large language models to operate as calibrated, interpretable reasoning modules rather than as a single end-to-end ranker. We applied RareCollab to 890 patients from three cohorts, including 119 Undiagnosed Diseases Network probands with paired DNA and RNA data, constituting a large systematic benchmark for multimodal rare disease diagnosis under paired genomic and transcriptomic evaluation. In this real-world…
