PrecLLM: A Privacy-Preserving Framework for Efficient Clinical Annotation Extraction from Unstructured EHRs using Small-Scale LLMs
Yixiang Qu, Yifan Dai, Shilin Yu, Pradham Tanikella, Malvika Pillai, Walter Chen, Jialiu Xie, Yishan Ren, Duan Wang, Yikai Wang, Sid Sheth, Guanting Chen, Yufeng Liu, Travis Schrank, Trevor Hackman, Didong Li, Di Wu

TL;DR
PrecLLM is a resource-efficient framework that enhances small-scale LLMs for clinical text annotation by incorporating privacy-preserving preprocessing, making it suitable for secure, local deployment in healthcare settings.
Contribution
The paper introduces a novel preprocessing technique combining regex and RAG to improve small LLMs' performance on clinical data within privacy and resource constraints.
Findings
Pre-filtering improves LLM accuracy on EHR tasks.
PrecLLM outperforms fine-tuned LLMs on the MIMIC-IV dataset.
The framework improves sensitivity, specificity, and F1 scores.
Abstract
Large Language Models (LLMs) have demonstrated remarkable proficiency in automated text annotation within natural language processing. However, their deployment in clinical settings is severely constrained by strict privacy regulations and the prohibitive computational cost of processing voluminous unstructured Electronic Health Records (EHRs). In this study, we developed a resource-efficient preprocessing technique that can be adopted in existing LLM procedures. This approach is particularly useful for smaller LLMs, which tend to be less accurate, and yields PrecLLM, a compact LLM framework optimized for local deployment in computational environments with stringent privacy requirements and restricted access to high-performance GPUs. The preprocessing step includes both regular expressions (regex) and Retrieval-Augmented Generation (RAG) to extract and highlight key…
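To make the regex half of the preprocessing idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a note can be split into sentences, that a hand-written keyword list defines what counts as relevant, and that matches are highlighted with `**` markers before being placed in a prompt. The keyword patterns, the function names (`split_sentences`, `prefilter`, `build_prompt`), and the prompt layout are all hypothetical, and the RAG retrieval step is left out entirely.

```python
import re

# Hypothetical keyword patterns for one annotation target (not taken from the paper).
PATTERNS = [r"\bsmok\w*", r"\btobacco\b", r"\bpack[- ]years?\b"]
KEYWORD_RE = re.compile("|".join(PATTERNS), flags=re.IGNORECASE)


def split_sentences(note: str) -> list[str]:
    # Naive split after sentence-ending punctuation; real clinical notes
    # (abbreviations, bullet fragments) would need a sturdier splitter.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def prefilter(note: str) -> list[str]:
    """Keep only sentences containing a keyword and wrap each match in ** **."""
    kept = []
    for sentence in split_sentences(note):
        if KEYWORD_RE.search(sentence):
            kept.append(KEYWORD_RE.sub(lambda m: f"**{m.group(0)}**", sentence))
    return kept


def build_prompt(note: str, question: str) -> str:
    # Assumed prompt layout: the filtered context first, then the question.
    context = "\n".join(prefilter(note)) or "(no relevant text found)"
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

For a note such as "Patient denies chest pain. He is a former smoker with 20 pack-years. Follow up in two weeks.", only the middle sentence would reach the model, with the matched terms highlighted. The point of the design is that a small local model sees a short, focused context instead of the full record.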
Taxonomy
Topics: Big Data Technologies and Applications
