Leveraging Large Language Models to Extract and Translate Medical Information in Doctors' Notes for Health Records and Diagnostic Billing Codes
Peter Hartnett, Chung-Chi Huang, Sarah Hartnett, and David Hartnett

TL;DR
This study evaluates open-weight large language models, run on local hardware, for extracting clinical information from medical notes and translating it into diagnostic codes. It highlights current limitations and proposes a human-in-the-loop approach for practical use.
Contribution
The paper introduces a reproducible local LLM pipeline and a benchmark dataset for privacy-preserving medical coding, and compares zero-shot, few-shot, and retrieval-augmented generation (RAG) prompting across several open-weight models.
Findings
High formatting compliance with JSON schema enforcement
Small models struggle with accurate diagnostic code generation
Few-shot prompting can degrade performance due to overfitting and hallucinations
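The first two findings describe different failure modes: a response can satisfy a JSON schema and still contain a code that does not exist or does not fit the note. The sketch below illustrates that distinction. It is not the paper's pipeline; the `{"codes": [...]}` response shape, the `check_response` helper, and the two-entry `valid_codes` set are hypothetical stand-ins, and the regular expression checks only the general structure of an ICD-10-CM code.

```python
import json
import re

# Structural pattern only: letter, digit, alphanumeric, then an optional
# dot followed by 1 to 4 alphanumerics. Matching it does not make a code real.
ICD10CM_PATTERN = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")


def check_response(raw, valid_codes):
    """Sort the codes in a raw model response into three buckets.

    Assumes (hypothetically) that the model was asked to return
    {"codes": ["...", ...]}. `valid_codes` stands in for a real
    ICD-10-CM code table.
    """
    result = {"schema_ok": False, "accepted": [], "malformed": [], "unknown": []}
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return result  # not JSON at all

    codes = payload.get("codes") if isinstance(payload, dict) else None
    if not isinstance(codes, list) or not all(isinstance(c, str) for c in codes):
        return result  # JSON, but not the expected shape

    result["schema_ok"] = True
    for code in codes:
        if not ICD10CM_PATTERN.match(code):
            result["malformed"].append(code)   # wrong structure
        elif code not in valid_codes:
            result["unknown"].append(code)     # plausible structure, not in the table
        else:
            result["accepted"].append(code)    # exists, but may still be clinically wrong
    return result


# Tiny illustrative code table (E11.9: type 2 diabetes without complications,
# I10: essential hypertension).
valid_codes = {"E11.9", "I10"}
report = check_response('{"codes": ["E11.9", "Z99.999X", "hello"]}', valid_codes)
```

Here the response passes the schema check, yet only one of its three codes survives the lookup. Even an accepted code would still need review against the note itself, which is where a human-in-the-loop step fits.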
Abstract
Physician burnout in the United States has reached critical levels, driven in part by the administrative burden of Electronic Health Record (EHR) documentation and complex diagnostic codes. To relieve this strain and maintain strict patient privacy, this thesis explores an on-device, offline automatic medical coding system. The work focuses on using open-weight Large Language Models (LLMs) to extract clinical information from physician notes and translate it into ICD-10-CM diagnostic codes without reliance on cloud-based services. A privacy-focused pipeline was developed using Ollama, LangChain, and containerized environments to evaluate multiple open-weight models, including Llama 3.2, Mistral, Phi, and DeepSeek, on consumer-grade hardware. Model performance was assessed for zero-shot, few-shot, and retrieval-augmented generation (RAG) prompting strategies using a novel benchmark of…
Taxonomy
Topics: Machine Learning in Healthcare · Electronic Health Records Systems · Medical Coding and Health Information
