Performance of weakly-supervised electronic health record-based phenotyping methods in rare-outcome settings
Yunjing Hong, Jennifer C. Nelson, Brian D. Williamson

TL;DR
This study evaluates weakly-supervised phenotyping methods that use electronic health records in rare-outcome settings, highlighting their variable performance and the importance of tuning-parameter choice.
Contribution
It provides a comprehensive comparison of three weakly-supervised methods (PheNorm, MAP, and sureLDA) across diverse simulation settings for rare outcomes.
Findings
No single method outperformed the others across all metrics.
sureLDA often performed well in simulations.
Performance is highly sensitive to tuning parameters.
Abstract
Accurately identifying patients with specific medical conditions is a key challenge when using clinical data from electronic health records. Our objective was to comprehensively assess when weakly-supervised prediction methods, which use silver-standard labels (proxy measures of the true outcome) rather than gold-standard true labels, perform well in rare-outcome settings like vaccine safety studies. We compared three methods (PheNorm, MAP, and sureLDA) that combine structured features and features derived from clinical text using natural language processing, through an extensive simulation study with data-generating mechanisms ranging from simple to complex, varying outcome rates, and varying degrees of informative silver labels. We also considered using predicted probabilities to design a chart review validation study. No single method dominated the other across all prediction…
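To make the idea of a silver-standard label in a rare-outcome setting concrete, here is a minimal sketch in Python. It is not the study's data-generating mechanism, and it does not implement PheNorm, MAP, or sureLDA. The function names, the Gaussian noise model, and the parameters `prevalence` and `signal` are all hypothetical, chosen only to show how the informativeness of a silver label can be varied and then scored against the true outcome.

```python
import random


def simulate(n=20000, prevalence=0.01, signal=2.0, seed=0):
    """Simulate a rare binary outcome and a noisy silver-standard score.

    Assumption (illustrative only): the silver score is Gaussian noise
    shifted upward by `signal` for true cases. Larger `signal` means a
    more informative silver label; `signal=0` means an uninformative one.
    """
    rng = random.Random(seed)
    y, silver = [], []
    for _ in range(n):
        yi = 1 if rng.random() < prevalence else 0
        y.append(yi)
        silver.append(rng.gauss(signal * yi, 1.0))
    return y, silver


def auc(y, score):
    """Rank-based AUC (Mann-Whitney U statistic); ties get average ranks."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    ranks = [0.0] * len(score)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and score[order[j + 1]] == score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank over the tie run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    rank_sum = sum(r for r, yi in zip(ranks, y) if yi == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    for s in (0.0, 1.0, 2.0):
        y, silver = simulate(signal=s)
        print(f"signal={s}: cases={sum(y)}, AUC of silver score={auc(y, silver):.3f}")
```

With a 1% outcome rate only a few hundred of the 20,000 simulated patients are cases, which is why rare-outcome evaluations of this kind tend to be noisy and sensitive to how informative the silver label is.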
