Unsupervised patient representations from clinical notes with interpretable classification decisions

Madhumita Sushil; Simon Šuster; Kim Luyckx; Walter Daelemans

arXiv:1711.05198 · cs.CL · November 15, 2017 · 1 citation

Open Access

TL;DR

This paper introduces unsupervised methods to generate dense, interpretable patient representations from clinical notes using autoencoders and paragraph vectors, and evaluates their effectiveness in supervised tasks.

Contribution

It presents novel unsupervised patient embedding techniques from clinical notes and explores their interpretability and feature significance in classification tasks.

Findings

01. Dense representations outperform sparse features in supervised tasks

02. Autoencoder features can be interpreted through input feature significance

03. Pretrained representations improve classification performance

Abstract

This work makes two main contributions. First, we explore the use of a stacked denoising autoencoder and a paragraph vector model to learn task-independent dense patient representations directly from clinical notes. We evaluate these representations by using them as features in multiple supervised setups and compare their performance with that of sparse representations. Second, to understand and interpret the representations, we explore the best-encoded features within the patient representations obtained from the autoencoder model. Further, we calculate the significance of the input features of the trained classifiers when we use these pretrained representations as input.
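
To make the idea of a denoising autoencoder over note-derived features concrete, here is a minimal sketch of a single layer trained on a toy bag-of-words matrix. It is an illustration only, not the authors' implementation: the abstract does not give the architecture, noise type, loss, or hyperparameters, so the masking noise, sigmoid units, squared-error loss, the function name `train_denoising_autoencoder`, and the toy data are all assumptions. A stacked model would train further layers of this kind on the encoded output.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_denoising_autoencoder(X, hidden_dim=3, noise=0.3,
                                epochs=500, lr=2.0, seed=0):
    """Train one denoising autoencoder layer on the rows of X (values in [0, 1]).

    Hypothetical helper for illustration; all hyperparameters are assumptions.
    Returns an encoder function and the per-epoch reconstruction losses.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_features, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, 0.1, (hidden_dim, n_features))
    b2 = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        # Masking noise: randomly zero out a fraction of the input features.
        mask = rng.random(X.shape) > noise
        X_noisy = X * mask
        H = sigmoid(X_noisy @ W1 + b1)        # dense hidden representation
        X_hat = sigmoid(H @ W2 + b2)          # reconstruction of the clean input
        losses.append(float(np.mean((X_hat - X) ** 2)))
        # Backpropagate the mean squared error against the *clean* input.
        d_out = 2.0 * (X_hat - X) * X_hat * (1.0 - X_hat) / X.size
        d_hid = (d_out @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X_noisy.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)

    def encode(A):
        # Encode uncorrupted inputs with the trained first layer.
        return sigmoid(A @ W1 + b1)

    return encode, losses


# Toy data (assumed): 6 patients x 6 binary term indicators.
X = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 0],
], dtype=float)

encode, losses = train_denoising_autoencoder(X)
patient_vectors = encode(X)   # one dense vector per patient
```

The rows of `patient_vectors` could then be passed as features to any downstream classifier, which mirrors the evaluation setup the abstract describes in general terms.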

Taxonomy

Topics: Machine Learning in Healthcare · Colorectal Cancer Screening and Detection · Biomedical Text Mining and Ontologies