# Clinical Concept Extraction for Document-Level Coding

**Authors:** Sarah Wiegreffe, Edward Choi, Sherry Yan, Jimeng Sun, Jacob Eisenstein

arXiv: 1906.03380 · 2019-06-11

## TL;DR

This paper tests whether clinical concept extraction can improve supervised machine learning models for document-level clinical coding. The extracted concepts yield no performance gains, and the authors explore possible explanations and future research directions.

## Contribution

The paper proposes two novel ways to integrate concept extraction with supervised document-level coding: using extracted concepts as features that supplement or replace the note text, and using them as labels for learning a better text representation.

## Key findings

- Extracted concepts did not yield performance gains on the document-level clinical coding task.
- The authors explore possible explanations for the lack of gains.
- The authors discuss future research directions.

## Abstract

The text of clinical notes can be a valuable source of patient information and clinical assessments. Historically, the primary approach for exploiting clinical notes has been information extraction: linking spans of text to concepts in a detailed domain ontology. However, recent work has demonstrated the potential of supervised machine learning to extract document-level codes directly from the raw text of clinical notes. We propose to bridge the gap between the two approaches with two novel syntheses: (1) treating extracted concepts as features, which are used to supplement or replace the text of the note; (2) treating extracted concepts as labels, which are used to learn a better representation of the text. Unfortunately, the resulting concepts do not yield performance gains on the document-level clinical coding task. We explore possible explanations and future research directions.
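The two syntheses named in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is one plausible reading of "concepts as features" and "concepts as labels", written under stated assumptions. The `Span` format, the function names `concepts_as_features` and `concepts_as_labels`, and the concept IDs such as `C_HF` are all hypothetical placeholders, and the spans are assumed not to overlap.

```python
from typing import List, Tuple

# Hypothetical concept-extractor output: (start_token, end_token_exclusive, concept_id).
# Spans are assumed to be non-overlapping.
Span = Tuple[int, int, str]


def concepts_as_features(
    tokens: List[str], spans: List[Span], mode: str = "supplement"
) -> List[str]:
    """Build a model input sequence from note tokens and extracted concepts.

    mode="supplement": keep the text and add the concept ID after each linked span.
    mode="replace":    substitute each linked span with its concept ID.
    """
    if mode not in ("supplement", "replace"):
        raise ValueError(f"unknown mode: {mode}")
    out: List[str] = []
    cursor = 0
    for start, end, concept_id in sorted(spans):
        out.extend(tokens[cursor:start])      # text before the linked span
        if mode == "supplement":
            out.extend(tokens[start:end])     # keep the original span text
        out.append(concept_id)                # add the concept as an extra token
        cursor = end
    out.extend(tokens[cursor:])               # text after the last span
    return out


def concepts_as_labels(spans: List[Span], concept_vocab: List[str]) -> List[int]:
    """Build a multi-hot auxiliary target marking which concepts occur in the note."""
    present = {concept_id for _, _, concept_id in spans}
    return [1 if concept in present else 0 for concept in concept_vocab]


# Toy note with made-up concept IDs (not real ontology identifiers).
tokens = ["patient", "has", "heart", "failure", "and", "diabetes"]
spans = [(2, 4, "C_HF"), (5, 6, "C_DM")]

supplemented = concepts_as_features(tokens, spans, mode="supplement")
replaced = concepts_as_features(tokens, spans, mode="replace")
aux_target = concepts_as_labels(spans, ["C_DM", "C_HF", "C_XX"])
```

In a setup like this, the feature sequences would feed a document-level code classifier, while the multi-hot vector would act as an auxiliary prediction target trained alongside the document-level codes. How the paper actually encodes concepts and combines the objectives is described in the full text, not in this summary.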

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.03380/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1906.03380/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1906.03380/full.md

---
Source: https://tomesphere.com/paper/1906.03380