Unsupervised EHR-based Phenotyping via Matrix and Tensor Decompositions
Florian Becker, Age K. Smilde, Evrim Acar

TL;DR
This paper reviews unsupervised methods for discovering patient subgroups from EHR data using matrix and tensor decompositions, with emphasis on interpretability, the handling of EHR data challenges, and temporal phenotyping.
Contribution
It provides a comprehensive categorization and analysis of low-rank approximation methods for EHR phenotyping, including validation approaches.
Findings
Categorizes approaches into temporal and static phenotyping
Highlights interpretability and data challenge solutions
Discusses validation of clinical significance
Abstract
Computational phenotyping allows for unsupervised discovery of subgroups of patients as well as corresponding co-occurring medical conditions from electronic health records (EHRs). Typically, EHR data contains demographic information, diagnoses and laboratory results. Discovering (novel) phenotypes has the potential to be of prognostic and therapeutic value. Providing medical practitioners with transparent and interpretable results is an important requirement and an essential part of advancing precision medicine. Low-rank data approximation methods such as matrix (e.g., non-negative matrix factorization) and tensor decompositions (e.g., CANDECOMP/PARAFAC) have demonstrated that they can provide such transparent and interpretable insights. Recent developments have adapted low-rank data approximation methods by incorporating different constraints and regularizations that facilitate…
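To make the low-rank idea in the abstract concrete, the following is a minimal sketch of non-negative matrix factorization applied to a patients-by-features matrix. It is an illustration under stated assumptions, not the method of any paper covered by the review: the data are synthetic, the helper name `nmf`, the rank of 3, and the iteration count are hypothetical choices, and the solver is the standard multiplicative-update rule for the Frobenius-norm objective.

```python
import numpy as np

def nmf(X, rank, n_iter=500, eps=1e-9, seed=0):
    """Factor a non-negative matrix X (patients x features) as W @ H.

    W (patients x rank) holds each patient's membership in each phenotype;
    H (rank x features) holds each phenotype's loading on each feature.
    Uses multiplicative updates, which keep W and H non-negative.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for EHR data (assumption: no real records are used).
rng = np.random.default_rng(42)
n_patients, n_features, n_phenotypes = 60, 12, 3
W_true = rng.random((n_patients, n_phenotypes))
H_true = rng.random((n_phenotypes, n_features))
X = W_true @ H_true  # non-negative and exactly rank 3 by construction

W, H = nmf(X, rank=n_phenotypes)

# Relative reconstruction error of the low-rank approximation.
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

# For each phenotype, the indices of its three highest-loading features.
top_features = np.argsort(H, axis=1)[:, ::-1][:, :3]
```

Reading the rows of `H` as candidate phenotypes and the rows of `W` as patient memberships is what makes the factorization interpretable: non-negativity means each patient is described as an additive mixture of phenotypes. A CANDECOMP/PARAFAC model extends the same idea to a third mode, such as time, with one factor matrix per mode.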
Taxonomy
Topics: Tensor decomposition and applications · Machine Learning in Healthcare
