Customizing Open Source LLMs for Quantitative Medication Attribute Extraction across Heterogeneous EHR Systems

Zhe Fei; Mehmet Yigit Turali; Shreyas Rajesh; Xinyang Dai; Huyen Pham; Pavan Holur; Yuhui Zhu; Larissa Mooney; Yih-Ing Hser; Vwani Roychowdhury

arXiv:2510.21027·cs.AI·October 27, 2025

TL;DR

This paper presents a framework that customizes open-source large language models (LLMs) to extract standardized medication attributes from heterogeneous EHR data, enabling consistent analysis of medications for opioid use disorder (MOUD) across multiple clinics.

Contribution

The study introduces a practical pipeline that adapts open-source LLMs to extract MOUD prescription data from diverse EHR systems, improving coverage and accuracy while supporting privacy-preserving deployment.

Findings

01. Qwen2.5-32B achieves 93.4% coverage and 93.0% accuracy.

02. MedGemma-27B achieves 93.1% coverage and 92.2% accuracy.

03. Error analysis led to targeted fixes for missing data and unit misinterpretations.

Abstract

Harmonizing medication data across Electronic Health Record (EHR) systems is a persistent barrier to monitoring medications for opioid use disorder (MOUD). In heterogeneous EHR systems, key prescription attributes are scattered across differently formatted fields and free-text notes. We present a practical framework that customizes open-source large language models (LLMs), including Llama, Qwen, Gemma, and MedGemma, to extract a unified set of MOUD prescription attributes (prescription date, drug name, duration, total quantity, daily quantity, and refills) from heterogeneous, site-specific data and compute a standardized metric of medication coverage, MOUD days, per patient. Our pipeline processes records directly in a fixed JSON schema, followed by lightweight normalization and cross-field consistency checks. We evaluate the system on prescription-level EHR data from five clinics…
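To make the post-extraction steps concrete, here is a minimal Python sketch of what a fixed record schema, a cross-field consistency check, and a per-patient MOUD-days computation might look like. It is an illustration under stated assumptions, not the authors' implementation. The field names, the rule that total quantity should roughly equal daily quantity times duration, the 10% tolerance, the treatment of refills as repeats of the original duration, and the definition of MOUD days as the count of distinct calendar days covered by at least one prescription are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Prescription:
    # The six attributes named in the abstract. Key names and units
    # (days, dosage units) are assumptions made for this sketch.
    prescription_date: date
    drug_name: str
    duration_days: Optional[float] = None
    total_quantity: Optional[float] = None
    daily_quantity: Optional[float] = None
    refills: int = 0


def fill_missing(rx: Prescription) -> Prescription:
    """Derive one missing quantity field from the other two (assumed rule)."""
    d, t, q = rx.duration_days, rx.total_quantity, rx.daily_quantity
    if d is None and t is not None and q:
        return replace(rx, duration_days=t / q)
    if t is None and d is not None and q is not None:
        return replace(rx, total_quantity=d * q)
    if q is None and t is not None and d:
        return replace(rx, daily_quantity=t / d)
    return rx


def is_consistent(rx: Prescription, rel_tol: float = 0.1) -> bool:
    """Check total ~= daily * duration within a hypothetical tolerance."""
    d, t, q = rx.duration_days, rx.total_quantity, rx.daily_quantity
    if d is None or t is None or q is None:
        return True  # not enough fields to cross-check
    return abs(t - d * q) <= rel_tol * max(t, 1.0)


def moud_days(prescriptions: Iterable[Prescription]) -> int:
    """Count distinct calendar days covered by any prescription.

    Assumes each refill repeats the original duration and that
    overlapping prescriptions are not double counted.
    """
    covered = set()
    for rx in prescriptions:
        if rx.duration_days is None:
            continue  # coverage cannot be computed without a duration
        n_days = int(round(rx.duration_days * (1 + rx.refills)))
        for offset in range(n_days):
            covered.add(rx.prescription_date + timedelta(days=offset))
    return len(covered)
```

For example, a record with a total quantity of 28 and a daily quantity of 2 would have its duration filled in as 14 days by `fill_missing`. Counting covered days as a set, rather than summing durations, is one way to keep overlapping prescriptions from inflating the metric; whether the paper handles overlaps this way is not stated in the abstract.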
