Unsupervised Learning for Computational Phenotyping

Chris Hodapp

arXiv:1612.08425 · stat.ML · December 30, 2016 · 1 citation

TL;DR

This paper presents an open-source unsupervised learning method for computational phenotyping using electronic health records, enabling discovery of hidden patterns without requiring labeled data, and generalizes it to laboratory time-series data.

Contribution

It adapts and scales an existing unsupervised phenotyping method to Apache Spark and Python, and extends it to laboratory time-series data in MIMIC-III.

Findings

01 Open-source tool available for exploration and visualization

02 Generalizes phenotyping to laboratory time-series data

03 Facilitates discovery of hidden clinical patterns

Abstract

With large volumes of health care data comes the research area of computational phenotyping, making use of techniques such as machine learning to describe illnesses and other clinical concepts from the data itself. The "traditional" approach of using supervised learning relies on a domain expert, and has two main limitations: requiring skilled humans to supply correct labels limits its scalability and accuracy, and relying on existing clinical descriptions limits the sorts of patterns that can be found. For instance, it may fail to acknowledge that a disease treated as a single condition may really have several subtypes with different phenotypes, as seems to be the case with asthma and heart disease. Some recent papers report successes using unsupervised learning instead. This shows great potential for finding patterns in Electronic Health Records that would otherwise be hidden and that…

Code & Models

Repositories

Hodapp87/mimic3_phenotyping
tf · Official

Taxonomy

Topics: Machine Learning in Healthcare · Time Series Analysis and Forecasting · Gaussian Processes and Bayesian Inference