Large language models are good medical coders, if provided with tools

Keith Kwan

arXiv:2407.12849 · cs.IR · July 19, 2024 · 5 cites

TL;DR

This paper introduces a two-stage Retrieve-Rank system for automated ICD-10-CM medical coding. On a dataset of 100 single-term medical conditions it reached 100% accuracy, compared with 6% for a vanilla LLM (GPT-3.5-turbo).

Contribution

The study presents a two-stage Retrieve-Rank approach to ICD-10-CM coding that outperforms a vanilla LLM (GPT-3.5-turbo) in accuracy, underscoring the value of retrieval-based methods for this task.

Findings

01 Retrieve-Rank system achieved 100% accuracy on 100 single-term medical conditions

02 Vanilla LLM (GPT-3.5-turbo) achieved 6% accuracy on the same dataset

03 Retrieval-based approach outperforms the standard LLM approach

Abstract

This study presents a novel two-stage Retrieve-Rank system for automated ICD-10-CM medical coding, comparing its performance against a Vanilla Large Language Model (LLM) approach. Evaluating both systems on a dataset of 100 single-term medical conditions, the Retrieve-Rank system achieved 100% accuracy in predicting correct ICD-10-CM codes, significantly outperforming the Vanilla LLM (GPT-3.5-turbo), which achieved only 6% accuracy. Our analysis demonstrates the Retrieve-Rank system's superior precision in handling various medical terms across different specialties. While these results are promising, we acknowledge the limitations of using simplified inputs and the need for further testing on more complex, realistic medical cases. This research contributes to the ongoing effort to improve the efficiency and accuracy of medical coding, highlighting the importance of retrieval-based…
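The two-stage idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the tiny `CODE_INDEX` table, the token-overlap retriever, and the string-similarity ranker are hypothetical stand-ins, not the retriever or ranker the paper uses (this page does not describe them). The sketch only shows the shape of the pipeline: narrow the code set first, then choose among the shortlisted candidates, then score predictions by exact-match accuracy.

```python
from difflib import SequenceMatcher

# Hypothetical toy index (assumption): a handful of ICD-10-CM codes with
# lightly normalized descriptions. A real system would index the full code set.
CODE_INDEX = {
    "E11.9": "type 2 diabetes mellitus without complications",
    "I10": "essential primary hypertension",
    "I15.9": "secondary hypertension unspecified",
    "J45.909": "unspecified asthma uncomplicated",
    "N39.0": "urinary tract infection site not specified",
}


def retrieve(term, k=3):
    """Stage 1: shortlist up to k codes whose description shares tokens with the term."""
    query = set(term.lower().split())
    scored = []
    for code, desc in CODE_INDEX.items():
        tokens = set(desc.split())
        overlap = len(query & tokens) / len(query | tokens)  # Jaccard similarity
        if overlap > 0:
            scored.append((overlap, code))
    scored.sort(reverse=True)
    return [code for _, code in scored[:k]]


def rank(term, candidates):
    """Stage 2: pick one candidate. A string-similarity stand-in for a learned ranker."""
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda c: SequenceMatcher(None, term.lower(), CODE_INDEX[c]).ratio(),
    )


def predict(term):
    """Run retrieval, then ranking, and return a single predicted code (or None)."""
    return rank(term, retrieve(term))


def accuracy(labelled_terms):
    """Fraction of (term, gold_code) pairs where the prediction matches exactly."""
    correct = sum(predict(term) == gold for term, gold in labelled_terms)
    return correct / len(labelled_terms)
```

With this toy index, `predict("hypertension")` shortlists both hypertension codes in stage 1 and leaves the final choice to stage 2, which is the division of labor the Retrieve-Rank name suggests. The ranker here simply favors the closest string match, so it is not a substitute for a model that understands clinical specificity.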

Code & Models

Repositories

ainativehealth/goodmedicalcoder (Official)

Taxonomy

Topics: Machine Learning in Healthcare · Medical Coding and Health Information · Biomedical Text Mining and Ontologies