Natural language processing of MIMIC-III clinical notes for identifying diagnosis and procedures with neural networks
Siddhartha Nuthakki, Sunil Neela, Judy W. Gichoya, Saptarshi Purkayastha

TL;DR
This study demonstrates that deep learning models, specifically ULMFiT, can effectively map unstructured clinical notes to medical codes, achieving high accuracy in predicting diagnoses and procedures from MIMIC-III data.
Contribution
The paper introduces the application of ULMFiT to large-scale clinical notes for accurate diagnosis and procedure prediction, outperforming traditional models.
Findings
Achieved over 80% accuracy for top-10 diagnoses and procedures.
Predicted top-50 codes with around 64-71% accuracy.
Potential to assist human coders and reduce healthcare costs.
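The "top-10" and "top-50" figures above are reported over a restricted label set. A minimal sketch of one way to build such a set and score predictions on it follows, assuming "top-k" means the k most frequent codes and that accuracy is an exact match on a single predicted code per note. The function names, the toy codes, and the filtering rule are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter

def top_k_codes(labels, k):
    """Return the k most frequent codes across all admissions.

    `labels` is a list of code lists, one list per clinical note
    (hypothetical layout, not the MIMIC-III table schema).
    """
    counts = Counter(code for codes in labels for code in codes)
    return [code for code, _ in counts.most_common(k)]

def filter_to_top_k(notes, labels, k):
    """Keep only notes that carry at least one top-k code,
    and drop every code outside the top-k set."""
    keep = set(top_k_codes(labels, k))
    filtered = []
    for note, codes in zip(notes, labels):
        kept = [c for c in codes if c in keep]
        if kept:
            filtered.append((note, kept))
    return filtered

def accuracy(predicted, actual):
    """Fraction of notes whose predicted code equals the true code
    (single-label simplification, assumed for illustration)."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Toy data: four notes with made-up code assignments.
notes = ["note a", "note b", "note c", "note d"]
labels = [["4019", "4280"], ["4019"], ["4280", "25000"], ["4019"]]

print(top_k_codes(labels, 2))
print(filter_to_top_k(notes, labels, 1))
print(accuracy(["4019", "4280"], ["4019", "4019"]))
```

Restricting the label set this way makes the task easier as k shrinks, which is consistent with the higher accuracy reported for top-10 than for top-50 codes.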
Abstract
Coding diagnoses and procedures in medical records is a crucial process in the healthcare industry; it underpins accurate billing, reimbursement from payers, and standardized patient care records. In the United States, billing- and insurance-related activities cost around $471 billion in 2012, about 25% of all U.S. hospital spending. In this paper, we report the performance of a natural language processing model that can map clinical notes to medical codes and predict the final diagnosis from unstructured entries such as history of present illness and symptoms at the time of admission. Previous studies have demonstrated that deep learning models perform better at such mapping than conventional machine learning models. Therefore, we employed a state-of-the-art deep learning method, ULMFiT, on the largest emergency department…
Taxonomy
Topics: Machine Learning in Healthcare · Medical Coding and Health Information · Biomedical Text Mining and Ontologies
Methods: Dropout · Sigmoid Activation · Tanh Activation · Temporal Activation Regularization · DropConnect · Long Short-Term Memory · Activation Regularization · Discriminative Fine-Tuning · Embedding Dropout · Variational Dropout
