RareAlert: Aligning heterogeneous large language model reasoning for early rare disease risk screening
Xi Chen, Hongru Zhou, Huahui Yi, Shiyu Feng, Hanyu Zhou, Tiancheng He, Mingke You, Li Wang, Qiankun Li, Kun Wang, Weili Fu, Kang Li, Jian Li

TL;DR
RareAlert is a system that integrates reasoning from multiple large language models, calibrates their signals, and distills the result into a single model to improve early rare disease risk screening from routine primary care data.
Contribution
This work introduces RareAlert, a method that aligns and calibrates reasoning from multiple LLMs to enhance rare disease risk prediction in clinical settings.
Findings
RareAlert achieved an AUC of 0.917 on independent test data.
It outperformed all evaluated LLMs and ensemble methods.
The system enables privacy-preserving, scalable screening for rare diseases.
Abstract
Missed and delayed diagnosis remains a major challenge in rare disease care. At initial clinical encounters, physicians assess rare disease risk using only limited information under high uncertainty. When high-risk patients are not recognised at this stage, targeted diagnostic testing is often not initiated, resulting in missed diagnosis. Existing primary care triage processes are structurally insufficient to reliably identify patients with rare diseases at initial clinical presentation, and universal screening is needed to reduce diagnostic delay. Here we present RareAlert, an early screening system that predicts patient-level rare disease risk from routinely available primary-visit information. RareAlert integrates reasoning generated by ten LLMs, calibrates and weights these signals using machine learning, and distils the aligned reasoning into a single locally deployable model.…
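The "calibrates and weights these signals using machine learning" step can be illustrated with a minimal sketch. The abstract as excerpted here does not specify the calibration model, so this is an assumption for illustration, not the authors' method: each LLM is taken to emit a per-patient risk probability, and a logistic-regression stacker is fitted on the logit of those scores so that informative models receive larger weights. All names (fit_stacker, predict_risk) and the synthetic two-model data are hypothetical, and the distillation step is not covered.

```python
import math
import random

def logit(p, eps=1e-6):
    # Clip to avoid infinities at exactly 0 or 1.
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)

def fit_stacker(scores, labels, lr=0.1, epochs=500):
    """Fit per-model weights and a bias by batch gradient descent.

    scores: one row per patient, one risk probability per (hypothetical) LLM.
    labels: 1 for a confirmed rare disease case, 0 otherwise.
    """
    n_models = len(scores[0])
    feats = [[logit(p) for p in row] for row in scores]
    w, b, n = [0.0] * n_models, 0.0, len(feats)
    for _ in range(epochs):
        gw, gb = [0.0] * n_models, 0.0
        for x, y in zip(feats, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(n_models):
                gw[j] += err * x[j]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_risk(row, w, b):
    # Calibrated, weighted combination of the per-model scores.
    return sigmoid(sum(wi * logit(p) for wi, p in zip(w, row)) + b)

# Synthetic data (an assumption): model 0 is informative, model 1 is noise.
rng = random.Random(0)
scores, labels = [], []
for _ in range(200):
    y = 1 if rng.random() < 0.3 else 0
    informative = (0.7 if y else 0.3) + rng.gauss(0, 0.1)
    noise = rng.uniform(0.05, 0.95)
    scores.append([informative, noise])
    labels.append(y)

w, b = fit_stacker(scores, labels)
high = predict_risk([0.9, 0.5], w, b)
low = predict_risk([0.1, 0.5], w, b)
```

In this toy setting the stacker should assign most of the weight to the informative model, which is the behaviour a calibrate-and-weight step is meant to provide. A real pipeline would use held-out data for fitting and many more signals than two scores.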
Taxonomy
Topics: Genomics and Rare Diseases · Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education
