PheW2P2V: a phenome-wide prediction framework with weighted patient representations using electronic health records
Jia Guo, Krzysztof Kiryluk, Shuang Wang

TL;DR
PheW2P2V is a framework that predicts disease phenotypes across the phenome from electronic health records, using patient representations weighted toward each target phenotype.
Contribution
PheW2P2V introduces a phenome-wide prediction framework using weighted patient vectors to improve prediction accuracy and reduce overfitting.
Findings
PheW2P2V achieved a median AUC-ROC of 0.74 across 942 phenome-wide predictions in MIMIC-III.
The framework outperformed baseline methods in max F1-score and AUC-PR metrics.
PheW2P2V leverages both labeled and unlabeled data to improve predictions for rare phenotypes.
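As a reading aid for the metrics named in the findings, here is a minimal sketch of how a per-phenotype AUC-ROC and max F1-score can be computed and then summarized by a median across phenotypes. The function names, the toy labels, and the toy scores are illustrative assumptions; this is not the authors' evaluation code and does not reproduce the reported numbers.

```python
from statistics import median


def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random case outscores a random control."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Ties between a case and a control count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def max_f1(labels, scores):
    """Largest F1-score obtained over all score thresholds."""
    best = 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(labels, scores) if s < t and y == 1)
        if tp:
            best = max(best, 2 * tp / (2 * tp + fp + fn))
    return best


# Hypothetical predictions for two phenotypes (toy data, not from MIMIC-III).
toy = {
    "phenotype_a": ([0, 0, 1, 1], [0.10, 0.40, 0.35, 0.80]),
    "phenotype_b": ([0, 1, 0, 1], [0.20, 0.90, 0.30, 0.70]),
}
per_phenotype_auc = {name: auc_roc(y, s) for name, (y, s) in toy.items()}
median_auc = median(per_phenotype_auc.values())
```

Reporting the median across phenotypes, rather than a single pooled score, keeps common phenotypes from dominating the summary.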
Abstract
Electronic health records (EHRs) provide opportunities for the development of computable predictive tools. Conventional machine learning methods and deep learning methods have been widely used for this task, with the approach of usually designing one tool for one clinical outcome. Here we developed PheW2P2V, a Phenome-Wide prediction framework using Weighted Patient Vectors. PheW2P2V conducts tailored predictions for phenome-wide phenotypes using numeric representations of patients’ past medical records weighted based on their similarities with individual phenotypes. PheW2P2V defines clinical disease phenotypes using Phecode mapping based on International Classification of Disease codes, which reduces redundancy and case-control misclassification in real-life EHR datasets. Through upweighting medical records of patients that are more relevant to a phenotype of interest in calculating…
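The idea of weighting a patient's past records by their similarity to a target phenotype can be illustrated with a short sketch. The excerpt above does not give the exact weighting scheme, so the cosine similarity, the softmax weights, the `temperature` parameter, and all names below are assumptions made for illustration, not the method as published.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def weighted_patient_vector(record_vectors, phenotype_vector, temperature=1.0):
    """Average a patient's record embeddings, upweighting those similar to the phenotype.

    Assumed scheme: softmax over cosine similarities (hypothetical choice).
    """
    sims = [cosine(r, phenotype_vector) for r in record_vectors]
    top = max(sims)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(phenotype_vector)
    vector = [
        sum(w * r[i] for w, r in zip(weights, record_vectors)) for i in range(dim)
    ]
    return vector, weights


# Toy 2-d embeddings: the first record points the same way as the phenotype.
records = [[1.0, 0.0], [0.0, 1.0]]
phenotype = [1.0, 0.0]
patient_vec, weights = weighted_patient_vector(records, phenotype)
```

Because the weights depend on the phenotype, the same patient gets a different vector for each phenotype, which is what makes each prediction tailored.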
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Biomedical Text Mining and Ontologies
