Modeling Diagnostic Label Correlation for Automatic ICD Coding

Shang-Chi Tsai; Chao-Wei Huang; Yun-Nung Chen

arXiv:2106.12800 · cs.CL · June 25, 2021

TL;DR

This paper introduces a two-stage framework for automatic ICD coding from clinical notes: a base predictor generates candidate label sets, and a label set distribution estimator rescores them to capture label correlations, improving prediction accuracy on the MIMIC benchmark datasets.

Contribution

The paper presents the first attempt at learning the label set distribution as a reranking module for medical code prediction, capturing label dependencies that independent per-label classifiers ignore.

Findings

01. Improved prediction accuracy on MIMIC datasets.

02. Effective modeling of label correlations enhances ICD coding.

03. First application of label set distribution learning in medical coding.

Abstract

Given the clinical notes written in electronic health records (EHRs), predicting the diagnostic codes, a task formulated as multi-label classification, is challenging. The large label set, the hierarchical dependencies among labels, and the imbalanced data make this prediction task extremely hard. Most existing work builds a binary predictor for each label independently, ignoring the dependencies between labels. To address this problem, we propose a two-stage framework that improves automatic ICD coding by capturing label correlation. Specifically, we train a label set distribution estimator to rescore the probability of each label set candidate generated by a base predictor. This paper is the first attempt at learning the label set distribution as a reranking module for medical code prediction. In the experiments, our proposed framework is able to improve upon best-performing…
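The two-stage idea in the abstract (a base predictor proposes label set candidates, and a label set estimator rescores them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the function names, the candidate generation strategy (toggling the most uncertain labels), the toy pairwise scorer standing in for a learned label set distribution estimator, the weight `alpha`, and all numeric values.

```python
import math
from itertools import combinations


def base_log_prob(label_set, probs):
    """Log-probability of a label set when each label is predicted independently.

    Assumes every probability lies strictly between 0 and 1.
    """
    return sum(
        math.log(p) if label in label_set else math.log(1.0 - p)
        for label, p in probs.items()
    )


def generate_candidates(probs, n_uncertain=2):
    """Hypothetical candidate generator: start from the thresholded set and
    toggle every subset of the n_uncertain labels closest to p = 0.5."""
    confident = {label for label, p in probs.items() if p >= 0.5}
    uncertain = sorted(probs, key=lambda label: abs(probs[label] - 0.5))[:n_uncertain]
    candidates = []
    for r in range(len(uncertain) + 1):
        for flipped in combinations(uncertain, r):
            candidates.append(frozenset(confident.symmetric_difference(flipped)))
    return candidates


def toy_set_log_score(label_set, pair_bonus):
    """Stand-in for a learned label set estimator: sums a score for each
    pair of labels that appears together in the candidate set."""
    return sum(
        pair_bonus.get(frozenset(pair), 0.0)
        for pair in combinations(sorted(label_set), 2)
    )


def rerank(probs, pair_bonus, alpha=1.0):
    """Return the candidate with the best combined score.

    alpha weights the set-level score; alpha = 0 ignores label correlation.
    """
    return max(
        generate_candidates(probs),
        key=lambda s: base_log_prob(s, probs) + alpha * toy_set_log_score(s, pair_bonus),
    )


# Made-up per-label probabilities from a hypothetical base predictor.
probs = {"428.0": 0.9, "401.9": 0.45, "250.00": 0.6, "584.9": 0.05}
# Made-up correlation score: these two codes are assumed to co-occur often.
pair_bonus = {frozenset({"428.0", "401.9"}): 0.5}

independent_choice = rerank(probs, pair_bonus, alpha=0.0)
reranked_choice = rerank(probs, pair_bonus, alpha=1.0)
print(sorted(independent_choice), sorted(reranked_choice))
```

With these made-up numbers, the independent choice is the thresholded set {428.0, 250.00}, while the pairwise bonus pulls the borderline code 401.9 into the reranked set. A learned estimator would replace `toy_set_log_score`; the combination rule shown here is only one plausible way to mix the two scores.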

Code & Models

Repositories

MiuLab/ICD-Correlation
PyTorch · Official

Taxonomy

Topics: Algorithms and Data Compression · Machine Learning in Healthcare · Text and Document Classification Technologies