Read, Attend, and Code: Pushing the Limits of Medical Codes Prediction from Clinical Notes by Machines
Byung-Hak Kim, Varun Ganapathi

TL;DR
This paper introduces RAC (Read, Attend, and Code), a machine learning model that improves automated medical code prediction from clinical notes, outperforming prior benchmarks and exceeding the human coder baseline measured on the same dataset, a step toward fully autonomous medical coding.
Contribution
The paper presents RAC, a new model combining convolutional embeddings, self-attention, and code-guided attention, achieving state-of-the-art results in medical code prediction from clinical notes.
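To make the idea of code-guided attention more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the scaled dot-product scoring, the function names (`code_guided_attention`, `softmax`, `dot`), and the toy vectors are all assumptions chosen for illustration. The sketch only shows the general pattern in which each medical code has its own query vector that attends over the note's token representations, giving one pooled vector per code.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def code_guided_attention(token_vecs, code_queries):
    """Return one attention-pooled vector per code query.

    token_vecs: list of token representations (e.g. outputs of a
        convolution + self-attention encoder; assumed here).
    code_queries: list of per-code query vectors (hypothetical; they
        could be derived from code titles or learned directly).
    """
    dim = len(token_vecs[0])
    pooled = []
    for q in code_queries:
        scale = math.sqrt(len(q))
        # Score every token against this code's query, then normalize.
        weights = softmax([dot(q, t) / scale for t in token_vecs])
        # Weighted sum of token vectors gives a code-specific summary.
        pooled.append(
            [sum(w * t[d] for w, t in zip(weights, token_vecs)) for d in range(dim)]
        )
    return pooled

# Toy example: 3 tokens of dimension 2, and 2 hypothetical codes.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
codes = [[4.0, 0.0], [0.0, 4.0]]
pooled = code_guided_attention(tokens, codes)
```

In a full model, each pooled vector would typically be passed through a per-code classifier to produce the probability that the code applies to the note; that final step is omitted here.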
Findings
RAC outperforms the previous best model's Macro-F1 score by 18.7%.
RAC surpasses human coder performance on the same dataset.
The model establishes a new benchmark for automated medical coding.
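The Macro-F1 metric cited in the findings above weights every code equally, so rare codes count as much as frequent ones. The sketch below contrasts it with Micro-F1 on toy data. It uses one common definition of Macro-F1 (the mean of per-label F1 scores); some medical coding papers compute it from macro-averaged precision and recall instead, so treat the exact formula as an assumption. The labels `A`, `B`, `C` and the helper names are hypothetical.

```python
def f1(tp, fp, fn):
    # F1 from raw counts; defined as 0.0 when there is nothing to score.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_micro_f1(y_true, y_pred, labels):
    """y_true and y_pred are lists of label sets, one set per note."""
    counts = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        counts[label] = (tp, fp, fn)
    # Macro: average the per-label F1 scores.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro: pool the counts across labels, then compute a single F1.
    totals = [sum(c[i] for c in counts.values()) for i in range(3)]
    micro = f1(*totals)
    return macro, micro

# Toy data: a model that always predicts the frequent code "A".
labels = ["A", "B", "C"]
y_true = [{"A"}, {"A", "B"}, {"A"}, {"C"}]
y_pred = [{"A"}, {"A"}, {"A"}, {"A"}]
macro, micro = macro_micro_f1(y_true, y_pred, labels)
```

On this toy data the Macro-F1 is well below the Micro-F1, because the two rare codes are never predicted. This is why gains in Macro-F1 are usually read as gains on infrequent codes.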
Abstract
Prediction of medical codes from clinical notes is both a practical and essential need for every healthcare delivery organization within current medical systems. Automating annotation will save significant time and excessive effort spent by human coders today. However, the biggest challenge is directly identifying appropriate medical codes out of several thousands of high-dimensional codes from unstructured free-text clinical notes. In the past three years, with Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, there have been vast improvements in tackling the most challenging benchmark of the MIMIC-III-full-label inpatient clinical notes dataset. This progress raises the fundamental question of how far automated machine learning (ML) systems are from human coders' working performance. We assessed the baseline of human coders' performance on the same…
Taxonomy
Topics: Machine Learning in Healthcare · Medical Coding and Health Information · Biomedical Text Mining and Ontologies
Methods: Linear Layer · 1-Dimensional Convolutional Neural Networks · Multi-Head Attention · Softmax · Attention Is All You Need · Stochastic Weight Averaging
