LA-MARRVEL: A Knowledge-Grounded, Language-Aware LLM Framework for Clinically Robust Rare Disease Gene Prioritization
Jaeyeon Lee, Lin Yao, Hyun-Hwan Jeong, and Zhandong Liu

TL;DR
LA-MARRVEL is a knowledge-grounded, language-aware LLM framework that significantly improves rare disease gene prioritization accuracy by effectively encoding phenotypic context and providing interpretable, clinically relevant reasoning.
Contribution
It introduces a novel LLM-based approach with structured, phenotype-rich prompts that enhance gene ranking accuracy and clinical interpretability in rare disease diagnosis.
Findings
12-15 percentage-point absolute improvement in Recall@1 over established gene prioritization approaches
Consistent performance across three real-world cohorts
Provides ACMG-aligned, interpretable reasoning for each candidate gene
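For readers unfamiliar with the metric, Recall@1 is the fraction of cases in which the true causal gene is ranked first, and a percentage-point gain is the plain difference between two such fractions expressed in percent. The sketch below is illustrative only: the function names, case IDs, gene labels, and rankings are made-up placeholders, not data or code from the paper.

```python
from typing import Dict, List


def recall_at_k(rankings: Dict[str, List[str]], causal: Dict[str, str], k: int = 1) -> float:
    """Fraction of cases whose causal gene appears among the top-k ranked candidates."""
    if not rankings:
        raise ValueError("no cases to evaluate")
    hits = sum(1 for case_id, ranked in rankings.items() if causal[case_id] in ranked[:k])
    return hits / len(rankings)


def percentage_point_gain(baseline: float, improved: float) -> float:
    """Absolute difference between two recall values, in percentage points."""
    return (improved - baseline) * 100


# Hypothetical toy data: four cases, one known causal gene each.
causal = {"case_1": "GENE_A", "case_2": "GENE_E", "case_3": "GENE_G", "case_4": "GENE_L"}

# Rankings from a hypothetical baseline tool and a hypothetical improved ranker.
baseline_rankings = {
    "case_1": ["GENE_A", "GENE_B", "GENE_C"],
    "case_2": ["GENE_D", "GENE_E", "GENE_F"],
    "case_3": ["GENE_H", "GENE_G", "GENE_I"],
    "case_4": ["GENE_J", "GENE_K", "GENE_L"],
}
improved_rankings = {
    "case_1": ["GENE_A", "GENE_B", "GENE_C"],
    "case_2": ["GENE_D", "GENE_E", "GENE_F"],
    "case_3": ["GENE_G", "GENE_H", "GENE_I"],
    "case_4": ["GENE_J", "GENE_K", "GENE_L"],
}

baseline_r1 = recall_at_k(baseline_rankings, causal, k=1)   # 1 of 4 cases
improved_r1 = recall_at_k(improved_rankings, causal, k=1)   # 2 of 4 cases
gain = percentage_point_gain(baseline_r1, improved_r1)
print(f"Recall@1: {baseline_r1:.2f} -> {improved_r1:.2f} ({gain:.0f} percentage points)")
```

Note that an absolute gain in percentage points differs from a relative gain: in this toy example Recall@1 doubles, but the absolute improvement is 25 points.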
Abstract
Rare disease diagnosis requires matching variant-bearing genes to complex patient phenotypes across large and heterogeneous evidence sources. This process remains time-intensive in current clinical interpretation pipelines. To overcome these limitations, we present LA-MARRVEL, a knowledge-grounded, language-aware LLM framework designed for clinical robustness and practical deployment. LA-MARRVEL delivers a 12-15 percentage-point absolute improvement in Recall@1 over established gene prioritization approaches, showing that architectural design can drive substantial accuracy gains. We found that the central contributor is structured, phenotype-rich prompt construction that explicitly encodes patient and disease phenotypes, preserving clinically meaningful context more effectively than disease labels alone. Across three real-world cohorts, LA-MARRVEL consistently improves gene-ranking…
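The abstract contrasts prompts that list patient and disease phenotypes with prompts that carry disease labels alone. The sketch below shows one way such a structured prompt might be assembled. It is an assumption-laden illustration, not LA-MARRVEL's actual prompt format: the `Candidate` class, the `build_prompt` function, the layout, the instruction wording, and the gene and disease names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical container for one candidate gene and its associated disease."""
    gene: str
    disease: str
    disease_phenotypes: List[str]


def build_prompt(patient_phenotypes: List[str], candidates: List[Candidate]) -> str:
    """Assemble a plain-text prompt that spells out phenotypes on both sides.

    A label-only prompt would stop at the disease name; here each candidate
    also carries the phenotype terms associated with its disease.
    """
    lines = ["Patient phenotypes:"]
    lines += [f"- {p}" for p in patient_phenotypes]
    lines += ["", "Candidate genes:"]
    for i, c in enumerate(candidates, start=1):
        lines.append(f"{i}. {c.gene} (associated disease: {c.disease})")
        lines.append("   Disease phenotypes: " + "; ".join(c.disease_phenotypes))
    lines += [
        "",
        "Rank the candidate genes from most to least likely to explain "
        "the patient's phenotypes, and justify each rank.",
    ]
    return "\n".join(lines)


# Placeholder inputs for illustration only.
prompt = build_prompt(
    patient_phenotypes=["Seizure", "Global developmental delay"],
    candidates=[
        Candidate("GENE_A", "Disease X", ["Seizure", "Hypotonia"]),
        Candidate("GENE_B", "Disease Y", ["Short stature"]),
    ],
)
print(prompt)
```

Keeping the phenotype terms in the prompt text, rather than collapsing them into a disease name, is the design choice the abstract credits for preserving clinically meaningful context.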
Taxonomy
Topics: Genomics and Rare Diseases · Biomedical Text Mining and Ontologies · Bioinformatics and Genomic Networks
