Practical Design and Benchmarking of Generative AI Applications for Surgical Billing and Coding
John C. Rollman (1), Bruce Rogers (1), Hamed Zaribafzadeh (1), Daniel Buckland (2), Ursula Rogers (1), Jennifer Gagnon (1), Ozanan Meireles (1), Lindsay Jennings (3), Jim Bennett (1), Jennifer Nicholson (3), Nandan Lad (4), Linda Cendales (1), Andreas Seas (4,5,6)

TL;DR
This study develops and benchmarks small, fine-tuned generative AI models for medical billing and coding, demonstrating they can match larger models' accuracy while maintaining privacy and accessibility.
Contribution
Introduces a practical approach for fine-tuning small LLMs for healthcare billing, showing they outperform or match larger models with minimal resources.
Findings
Fine-tuned models achieved up to 72% accuracy in ICD-10 coding.
The models fabricated less than 1% of codes, indicating high reliability.
Small models performed comparably to GPT-4o in accuracy.
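The two headline metrics above (coding accuracy and the share of fabricated codes) can be made concrete with a small sketch. The summary does not say how the paper defines either metric, so the definitions here are assumptions: accuracy is taken as exact match against a reference code, and a code counts as fabricated when it is absent from a lookup set of valid codes. The function names, the toy code set, and the sample predictions are all hypothetical and are not data from the study.

```python
from typing import Iterable, Sequence, Set


def normalize(code: str) -> str:
    """Trim whitespace and upper-case a code so comparisons are consistent."""
    return code.strip().upper()


def exact_match_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of cases where the predicted code equals the reference code.

    Assumption: one predicted code and one reference code per case.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predicted, reference))
    return hits / len(reference)


def fabrication_rate(predicted: Iterable[str], valid_codes: Set[str]) -> float:
    """Fraction of predicted codes that are absent from the valid code set.

    Assumption: "fabricated" means "not present in the lookup set".
    """
    preds = [normalize(p) for p in predicted]
    if not preds:
        return 0.0
    fabricated = sum(p not in valid_codes for p in preds)
    return fabricated / len(preds)


# Toy lookup set standing in for a real code table (hypothetical).
valid = {"K35.80", "K40.90", "K80.20"}

# Hypothetical reference labels and model outputs for four cases.
reference = ["K35.80", "K40.90", "K80.20", "K35.80"]
predicted = ["K35.80", "K40.90", "K80.99", "K40.90"]

accuracy = exact_match_accuracy(predicted, reference)  # 2 of 4 match
fabricated = fabrication_rate(predicted, valid)        # 1 of 4 not in the set

print(f"accuracy={accuracy:.2f} fabricated={fabricated:.2f}")
```

Keeping the two measures separate matters because they capture different failures: the last toy prediction is a valid code that is simply wrong for the case, so it lowers accuracy without counting as fabricated.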
Abstract
Background: Healthcare has many manual processes that can benefit from automation and augmentation with Generative Artificial Intelligence (AI), such as the medical billing and coding process. However, current foundational Large Language Models (LLMs) perform poorly when tasked with generating accurate International Classification of Diseases, 10th edition, Clinical Modification (ICD-10-CM) and Current Procedural Terminology (CPT) codes. Additionally, there are many security and financial challenges in the application of generative AI to healthcare. We present a strategy for developing generative AI tools in healthcare, specifically for medical billing and coding, that balances accuracy, accessibility, and patient privacy. Methods: We fine-tune the PHI-3 Mini and PHI-3 Medium LLMs using institutional data and compare the results against the PHI-3 base model, a PHI-3 RAG application, and…
Taxonomy
Topics: Surgical Simulation and Training
Methods: Layer Normalization · Dense Connections · Linear Warmup With Linear Decay · WordPiece · Attention Dropout · Adam · Residual Connection · Dropout · Softmax
