Multi-stage Retrieve and Re-rank Model for Automatic Medical Coding Recommendation
Xindi Wang, Robert E. Mercer, Frank Rudzicz

TL;DR
This paper introduces a multi-stage retrieve and re-rank framework for automatic medical coding, combining hybrid retrieval and contrastive re-ranking to improve accuracy in ICD code assignment from electronic health records.
Contribution
The paper proposes a novel multi-stage framework with hybrid retrieval and contrastive re-ranking, achieving state-of-the-art results in ICD coding from EHR data.
Findings
Achieves state-of-the-art performance on the MIMIC-III benchmark.
Effectively handles large, long-tailed ICD label sets.
Improves accuracy through re-ranking guided by label co-occurrence.
Abstract
The International Classification of Diseases (ICD) serves as a definitive medical classification system encompassing a wide range of diseases and conditions. The primary objective of ICD indexing is to allocate a subset of ICD codes to a medical record, which facilitates standardized documentation and management of various health conditions. Most existing approaches struggle to select the proper label subset from an extremely large ICD collection with a heavily long-tailed label distribution. In this paper, we leverage a multi-stage "retrieve and re-rank" framework as a novel solution to ICD indexing: a hybrid discrete retrieval method first selects candidate codes, and the retrieved candidates are then re-ranked with contrastive learning, which allows the model to make more accurate predictions from a simplified label space. The retrieval model is a hybrid of auxiliary knowledge of the electronic health records (EHR)…
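The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the bag-of-words retriever stands in for their hybrid retrieval method, the `scorer` callable is a placeholder for a trained contrastive re-ranker, and the function names and the shortened code descriptions are assumptions made for illustration only.

```python
import math
from collections import Counter

# Toy label space: a few ICD-9 codes with shortened, paraphrased descriptions.
# (Illustrative only; a real label space has thousands of codes.)
CODE_DESCRIPTIONS = {
    "401.9": "unspecified essential hypertension",
    "428.0": "congestive heart failure unspecified",
    "250.00": "diabetes mellitus type ii without complication",
    "486": "pneumonia organism unspecified",
    "584.9": "acute kidney failure unspecified",
}


def bag_of_words(text):
    """Lowercase, whitespace-tokenize, and count tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(note, k):
    """Stage 1 (stand-in): cheap lexical retrieval that keeps the top-k codes."""
    note_vec = bag_of_words(note)
    scored = [
        (cosine(note_vec, bag_of_words(desc)), code)
        for code, desc in CODE_DESCRIPTIONS.items()
    ]
    # Sort by descending score; break ties by code string for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [code for _, code in scored[:k]]


def rerank(note, candidates, scorer):
    """Stage 2 (stand-in): re-score only the shortlisted candidates.

    `scorer(note, code)` is a hypothetical callable; in a real system it
    would be a learned model rather than a hand-written function.
    """
    return sorted(candidates, key=lambda code: -scorer(note, code))


if __name__ == "__main__":
    note = "patient with congestive heart failure and essential hypertension"
    shortlist = retrieve(note, k=3)
    # Hypothetical re-ranker scores, hard-coded purely for demonstration.
    fake_scores = {"401.9": 0.9, "428.0": 0.8, "584.9": 0.1}
    print(shortlist)
    print(rerank(note, shortlist, lambda _note, code: fake_scores.get(code, 0.0)))
```

The point of the split is that the expensive scorer only ever sees `k` candidates rather than the full label set, which is what the abstract means by predicting from a simplified label space.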
Taxonomy
Topics: AI in cancer detection · Image Retrieval and Classification Techniques · Cognitive Computing and Networks
Methods: Contrastive Learning
