A variational Bayes latent class approach for EHR-based patient phenotyping in R
Brian Buckley, Adrian O'Hagan, Marie Galligan

TL;DR
This paper introduces VBphenoR, an R package that employs a variational Bayes Gaussian Mixture Model and logistic regression to identify patient phenotypes from EHR data, enabling improved phenotyping and biomarker analysis.
Contribution
The paper presents a novel R package implementing a closed-form variational Bayes approach for patient phenotyping using EHR data, combining GMM and logistic regression.
Findings
Effective identification of patient phenotypes from EHR data.
Quantitative assessment of biomarker shifts associated with phenotypes.
Evaluation of the predictive performance of binary indicator clinical codes and medication codes.
Abstract
The VBphenoR package for R provides a closed-form variational Bayes approach to patient phenotyping using Electronic Health Records (EHR) data. We implement a variational Bayes Gaussian Mixture Model (GMM) algorithm using closed-form coordinate ascent variational inference (CAVI) to determine the patient phenotype latent class. We then implement a variational Bayes logistic regression, with which we determine the probability of the phenotype in the supplied EHR cohort, estimate the shift in biomarkers for patients with the phenotype of interest versus a healthy population, and evaluate the predictive performance of binary indicator clinical codes and medication codes. The logistic model likelihood uses the latent class from the GMM step to inform the conditional distribution.
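To make the CAVI step in the abstract concrete, here is a minimal sketch of coordinate ascent updates for a deliberately simplified mixture. It is not the VBphenoR implementation or its API. It is written in plain Python rather than R, and it assumes a one-dimensional mixture with unit component variance, equal mixing weights, and a zero-mean Gaussian prior on each component mean. The function name `cavi_gmm_1d`, its parameters, and the simulated data are all hypothetical and chosen only for illustration.

```python
import math
import random


def cavi_gmm_1d(x, k, prior_var=10.0, n_iter=100):
    """CAVI for a 1-D Gaussian mixture with unit component variance.

    Assumed model (illustrative only, not the package's model):
        mu_j ~ N(0, prior_var),  c_i ~ Uniform{1..k},  x_i | c_i ~ N(mu_{c_i}, 1)
    Variational family:
        q(mu_j) = N(m[j], s2[j]),  q(c_i) = Categorical(phi[i])
    """
    lo, hi = min(x), max(x)
    # Spread the initial variational means evenly across the data range.
    m = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    s2 = [1.0] * k
    phi = []
    for _ in range(n_iter):
        # Update q(c_i): phi[i][j] is proportional to
        # exp(E[mu_j] * x_i - E[mu_j^2] / 2), computed stably via max-shift.
        phi = []
        for xi in x:
            logits = [m[j] * xi - 0.5 * (s2[j] + m[j] ** 2) for j in range(k)]
            top = max(logits)
            w = [math.exp(v - top) for v in logits]
            total = sum(w)
            phi.append([wj / total for wj in w])
        # Update q(mu_j): Gaussian with closed-form mean and variance.
        for j in range(k):
            nj = sum(p[j] for p in phi)
            precision = 1.0 / prior_var + nj
            m[j] = sum(p[j] * xi for p, xi in zip(phi, x)) / precision
            s2[j] = 1.0 / precision
    return m, s2, phi


# Simulated "biomarker" values from two well-separated groups (made-up data).
random.seed(1)
data = [random.gauss(-3.0, 1.0) for _ in range(200)]
data += [random.gauss(3.0, 1.0) for _ in range(200)]

means, variances, resp = cavi_gmm_1d(data, k=2)
# Hard latent-class assignment: the component with the largest responsibility.
labels = [max(range(2), key=lambda j: r[j]) for r in resp]
```

Each update has a closed form, so no sampling or numerical optimisation is needed inside the loop. In a two-stage workflow like the one the abstract describes, the resulting `labels` (or the soft responsibilities in `resp`) would play the role of the latent class passed to the subsequent regression step.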
Taxonomy
Topics: Machine Learning in Healthcare · Bayesian Methods and Mixture Models · Statistical Methods and Inference
