A Two-Stage Decoder for Efficient ICD Coding
Thanh-Tung Nguyen, Viktor Schlegel, Abhinav Kashyap, Stefan Winkler

TL;DR
This paper introduces a two-stage decoding approach for ICD coding that mimics human coding by first predicting general categories and then specific subcategories, improving efficiency and accuracy.
Contribution
The paper proposes a novel hierarchical two-stage decoder for ICD coding that leverages code structure without external data, enhancing performance.
Findings
Effective in single-model setting
No external data or knowledge needed
Improves accuracy over baseline models
Abstract
Clinical notes in healthcare facilities are tagged with International Classification of Diseases (ICD) codes, a list of classification codes for medical diagnoses and procedures. ICD coding is a challenging multilabel text classification problem due to noisy clinical document inputs and a long-tailed label distribution. Recent automated ICD coding efforts improve performance by encoding medical notes and codes with additional data and knowledge bases. However, most of them do not reflect how human coders generate the codes: coders first select general code categories and then look for specific subcategories that are relevant to a patient's condition. Inspired by this, we propose a two-stage decoding mechanism to predict ICD codes. Our model uses the hierarchical properties of the codes to split the prediction into two steps: first, we predict the parent code and then predict the…
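The parent-then-child idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the function names, the score dictionaries, the thresholds, and the hard gating rule (a specific code is kept only if its parent category was selected) are all assumptions made for illustration, as is the convention that the parent of a dotted code is the part before the dot.

```python
from typing import Dict, List


def parent_of(code: str) -> str:
    """Return the category part of a dotted code, e.g. '250.00' -> '250'.

    Assumption: the parent is everything before the first dot.
    """
    return code.split(".", 1)[0]


def two_stage_decode(
    parent_scores: Dict[str, float],
    child_scores: Dict[str, float],
    parent_threshold: float = 0.5,  # hypothetical cutoff
    child_threshold: float = 0.5,   # hypothetical cutoff
) -> List[str]:
    # Stage 1: keep the general categories whose score clears the cutoff.
    selected_parents = {
        p for p, s in parent_scores.items() if s >= parent_threshold
    }
    # Stage 2: consider only specific codes under a selected category.
    return sorted(
        c
        for c, s in child_scores.items()
        if parent_of(c) in selected_parents and s >= child_threshold
    )


# Made-up scores standing in for the outputs of a trained model.
parent_scores = {"250": 0.9, "401": 0.2}
child_scores = {"250.00": 0.8, "250.01": 0.3, "401.9": 0.7}

predicted = two_stage_decode(parent_scores, child_scores)
print(predicted)  # ['250.00']
```

In this toy run, `401.9` is dropped even though its own score is high, because its parent category `401` was not selected in the first stage. Whether the actual model gates child predictions this strictly is not stated in the text above.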
Taxonomy
Topics: Algorithms and Data Compression
