A Weakly Supervised Transformer for Rare Disease Diagnosis and Subphenotyping from EHRs with Pulmonary Case Studies
Kimberly F. Greco, Zongxin Yang, Mengyan Li, Han Tong, Sara Morini Sweet, Alon Geva, Kenneth D. Mandl, Benjamin A. Raby, Tianxi Cai

TL;DR
This paper introduces WEST, a weakly supervised transformer model that leverages EHR data and limited expert labels to improve rare disease detection, subphenotyping, and progression prediction with minimal manual annotation.
Contribution
WEST is a novel framework that combines weak supervision and transformer models to enhance rare disease phenotyping from noisy EHR data.
Findings
WEST outperforms existing methods in phenotype classification.
It effectively identifies clinically meaningful subphenotypes.
The model predicts disease progression more accurately than existing methods.
Abstract
Rare diseases affect an estimated 300-400 million people worldwide, yet individual conditions remain underdiagnosed and poorly characterized due to their low prevalence and limited clinician familiarity. Computational phenotyping offers a scalable approach to improving rare disease detection, but algorithm development is hindered by the scarcity of high-quality labeled data for training. Expert-labeled datasets from chart reviews and registries are clinically accurate but limited in scope and availability, whereas labels derived from electronic health records (EHRs) provide broader coverage but are often noisy or incomplete. To address these challenges, we propose WEST (WEakly Supervised Transformer for rare disease phenotyping and subphenotyping from EHRs), a framework that combines routinely collected EHR data with a limited set of expert-validated cases and controls to enable…
Taxonomy
Topics: Lung Cancer Diagnosis and Treatment · Lung Cancer Treatments and Mutations
