Distilling Large Language Models for Efficient Clinical Information Extraction
Karthik S. Vedula, Annika Gupta, Akshay Swaminathan, Ivan Lopez, Suhana Bedi, and Nigam H. Shah

TL;DR
This study demonstrates that distilled BERT models can efficiently perform clinical information extraction with comparable accuracy to large language models, while significantly reducing computational costs and inference time.
Contribution
The paper introduces a knowledge distillation approach that produces smaller, faster BERT models for clinical NER tasks, with performance similar to that of much larger LLMs.
Findings
Distilled BERT models achieved high F1 scores, close to those of the much larger LLMs.
Distilled models were up to 101x cheaper and 12x faster.
External validation confirmed robust performance across datasets.
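For readers unfamiliar with the metric, the F1 score reported in the findings is the harmonic mean of precision and recall. The sketch below computes it over extracted entities. It assumes exact-match scoring on hypothetical (start, end, label) tuples; the summary does not state which matching criterion the paper uses, so treat this as an illustration rather than the authors' evaluation code.

```python
def entity_f1(predicted, gold):
    """Exact-match entity-level F1.

    Both arguments are iterables of (start, end, label) tuples.
    Exact matching is an assumption made for this illustration.
    """
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one entity matches exactly, one has the wrong boundary.
gold = [(16, 25, "MEDICATION"), (30, 45, "DISEASE")]
predicted = [(16, 25, "MEDICATION"), (30, 34, "DISEASE")]
score = entity_f1(predicted, gold)
print(f"F1 = {score:.2f}")
```

Under exact matching, a prediction with a wrong boundary counts as both a false positive and a false negative, which is why the partially overlapping disease span earns no credit here.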
Abstract
Large language models (LLMs) excel at clinical information extraction but their computational demands limit practical deployment. Knowledge distillation--the process of transferring knowledge from larger to smaller models--offers a potential solution. We evaluate the performance of distilled BERT models, which are approximately 1,000 times smaller than modern LLMs, for clinical named entity recognition (NER) tasks. We leveraged state-of-the-art LLMs (Gemini and OpenAI models) and medical ontologies (RxNorm and SNOMED) as teacher labelers for medication, disease, and symptom extraction. We applied our approach to over 3,300 clinical notes spanning five publicly available datasets, comparing distilled BERT models against both their teacher labelers and BERT models fine-tuned on human labels. External validation was conducted using clinical notes from the MedAlign dataset. For disease…
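The distillation setup described in the abstract relies on teacher labelers (LLMs or ontologies) producing entity annotations that a smaller BERT model is then trained on. A minimal sketch of the label-preparation step might look like the following. It assumes the teacher output has already been reduced to character-offset spans, and uses a simple lexicon matcher as a stand-in for a teacher; the function names, the regex tokenizer, the BIO tagging scheme, and the example note are all hypothetical and not taken from the paper.

```python
import re


def tokenize(text):
    """Split into word and punctuation tokens, keeping character offsets."""
    return [(m.group(), m.start(), m.end())
            for m in re.finditer(r"\w+|[^\w\s]", text)]


def find_spans(text, terms, label):
    """Hypothetical stand-in for a teacher labeler: case-insensitive lexicon match."""
    spans = []
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            spans.append((m.start(), m.end(), label))
    return spans


def spans_to_bio(text, spans):
    """Convert (start, end, label) character spans into per-token BIO tags."""
    tokens = tokenize(text)
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        inside = False
        for i, (_, tok_start, tok_end) in enumerate(tokens):
            # A token belongs to the entity if it lies fully inside the span.
            if tok_start >= start and tok_end <= end:
                tags[i] = ("I-" if inside else "B-") + label
                inside = True
    return [tok for tok, _, _ in tokens], tags


note = "Patient started metformin for type 2 diabetes."
spans = (find_spans(note, ["metformin"], "MEDICATION")
         + find_spans(note, ["type 2 diabetes"], "DISEASE"))
tokens, tags = spans_to_bio(note, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

The resulting token and tag pairs are the kind of weakly labeled data a student token-classification model could be fine-tuned on; in practice the tags would also need to be aligned to the student's subword (for example WordPiece) tokenization.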
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Layer Normalization · Attention Dropout · Linear Layer · Softmax · Dense Connections · Linear Warmup With Linear Decay · Attention Is All You Need · WordPiece · Dropout
