Supervised multi-specialist topic model with applications on large-scale electronic health record data
Ziyang Song, Xavier Sumba Toral, Yixin Xu, Aihua Liu, Liming Guo, Guido Powell, Aman Verma, David Buckeridge, Ariane Marelli, Yue Li

TL;DR
This paper introduces MixEHR-S, a novel supervised hierarchical Bayesian topic model for EHR data that jointly infers disease and specialist topics, enabling accurate disease prediction and patient risk stratification.
Contribution
The paper presents a unified model for EHR data that captures disease and specialist topics simultaneously, pairs it with a closed-form collapsed variational inference algorithm, and applies it to two independent large-scale EHR databases in Quebec.
Findings
Achieved superior prediction accuracy over existing methods.
Identified clinically meaningful latent topics.
Demonstrated effectiveness in three disease prediction tasks.
Abstract
Motivation: Electronic health record (EHR) data provides a new venue to elucidate disease comorbidities and latent phenotypes for precision medicine. To fully exploit its potential, a realistic data generative process of the EHR data needs to be modelled. We present MixEHR-S to jointly infer specialist-disease topics from the EHR data. As the key contribution, we model the specialist assignments and ICD-coded diagnoses as the latent topics based on patient's underlying disease topic mixture in a novel unified supervised hierarchical Bayesian topic model. For efficient inference, we developed a closed-form collapsed variational inference algorithm to learn the model distributions of MixEHR-S. We applied MixEHR-S to two independent large-scale EHR databases in Quebec with three targeted applications: (1) Congenital Heart Disease (CHD) diagnostic prediction among 154,775 patients; (2)…
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare
Methods: Variational Inference
