A Deep Learning Architecture for De-identification of Patient Notes: Implementation and Evaluation

Kaung Khin; Philipp Burckhardt; Rema Padman

arXiv:1810.01570·cs.CL·October 4, 2018·30 cites

Open Access

TL;DR

This paper introduces a deep learning model utilizing contextualized embeddings and variational dropout Bi-LSTMs for de-identifying patient notes, achieving state-of-the-art results efficiently without external knowledge sources.

Contribution

The paper presents a deep learning architecture for de-identifying clinical notes that combines deep contextualized word embeddings with variational dropout Bi-LSTMs, and it reaches state-of-the-art performance on two gold standard datasets.

Findings

01. Achieves state-of-the-art performance on two datasets.

02. Converges faster than previous models.

03. Does not require dictionaries or external knowledge sources.

Abstract

De-identification is the process of removing the 18 types of protected health information (PHI) from clinical notes so that the text is no longer considered individually identifiable. Recent advances in natural language processing (NLP) have allowed for the use of deep learning techniques for the task of de-identification. In this paper, we present a deep learning architecture that builds on the latest NLP advances by incorporating deep contextualized word embeddings and variational dropout Bi-LSTMs. We test this architecture on two gold standard datasets and show that it achieves state-of-the-art performance on both datasets while also converging faster than other systems, without the use of dictionaries or other knowledge sources.
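The abstract names variational dropout as one ingredient of the architecture. As a rough illustration of that idea only, and not the authors' implementation, the sketch below shows what distinguishes it from ordinary dropout: one mask is sampled per sequence and reused at every time step. The function names, the list-of-lists representation, and the dropout rate are all assumptions made for this example; a real Bi-LSTM would apply such masks inside the recurrent layers of a deep learning framework.

```python
import random


def sample_mask(size, rate, rng):
    """Sample one inverted-dropout mask of length `size`.

    Each unit is zeroed with probability `rate`; kept units are scaled
    by 1 / (1 - rate) so the expected activation is unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [1.0 / keep if rng.random() < keep else 0.0 for _ in range(size)]


def apply_variational_dropout(sequence, rate, rng):
    """Apply the SAME dropout mask to every time step of one sequence.

    `sequence` is a list of time steps, each a list of floats of equal
    length (hypothetical stand-ins for token embeddings).
    """
    if not sequence:
        return []
    mask = sample_mask(len(sequence[0]), rate, rng)  # sampled once per sequence
    return [[x * m for x, m in zip(step, mask)] for step in sequence]


# Toy input: 5 time steps with 4 features each.
rng = random.Random(0)
seq = [[1.0, 2.0, 3.0, 4.0] for _ in range(5)]
dropped = apply_variational_dropout(seq, rate=0.5, rng=rng)
```

Because the mask is shared, the same feature positions are zeroed at every time step of `dropped`, whereas standard dropout would draw a fresh mask for each step.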

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Machine Learning in Healthcare