CKD-EHR: Clinical Knowledge Distillation for Electronic Health Records

Junke Wang; Hongshun Ling; Li Zhang; Longqian Zhang; Fang Wang; Yuan Gao; Zhi Li

arXiv:2506.15118·cs.CL·June 19, 2025

Open Access

TL;DR

This paper introduces CKD-EHR, a knowledge distillation framework that transfers knowledge from a fine-tuned large language model (Qwen2.5-7B) to a lightweight BERT student model, improving both the accuracy and the inference efficiency of EHR-based disease prediction.

Contribution

The study presents a knowledge distillation approach based on multi-granularity attention for EHR-based disease prediction. It targets two problems of existing large language models: insufficient representation of medical knowledge and low efficiency in clinical deployment.

Findings

01. 9% increase in diagnostic accuracy
02. 27% improvement in F1-score
03. 22.2× faster inference speed

Abstract

Electronic Health Records (EHR)-based disease prediction models have demonstrated significant clinical value in promoting precision medicine and enabling early intervention. However, existing large language models face two major challenges: insufficient representation of medical knowledge and low efficiency in clinical deployment. To address these challenges, this study proposes the CKD-EHR (Clinical Knowledge Distillation for EHR) framework, which achieves efficient and accurate disease risk prediction through knowledge distillation techniques. Specifically, the large language model Qwen2.5-7B is first fine-tuned on medical knowledge-enhanced data to serve as the teacher model. It then generates interpretable soft labels through a multi-granularity attention distillation mechanism. Finally, the distilled knowledge is transferred to a lightweight BERT student model. Experimental results…
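The soft-label transfer described in the abstract can be illustrated with a generic distillation loss. The sketch below is not the paper's objective: it assumes the standard temperature-scaled formulation (a KL term between teacher and student distributions combined with cross-entropy on the true label) and leaves out the multi-granularity attention mechanism, whose details are not given here. The names `softmax` and `distillation_loss`, and the `temperature` and `alpha` parameters, are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer labels."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Generic soft-label distillation loss (an assumption, not the paper's).

    alpha weights the soft-label term (KL divergence from the teacher's
    softened distribution to the student's); 1 - alpha weights the ordinary
    cross-entropy against the ground-truth label.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q)
             for p, q in zip(p_teacher, p_student_soft) if p > 0)

    p_student = softmax(student_logits)  # temperature 1 for the hard label
    cross_entropy = -math.log(p_student[true_label])

    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term's as the temperature changes.
    return alpha * temperature ** 2 * kl + (1 - alpha) * cross_entropy


if __name__ == "__main__":
    teacher = [2.0, -1.0]  # made-up teacher logits: disease vs. no disease
    agreeing_student = [2.0, -1.0]
    disagreeing_student = [-1.0, 2.0]
    print(distillation_loss(agreeing_student, teacher, true_label=0))
    print(distillation_loss(disagreeing_student, teacher, true_label=0))
```

In this formulation a student that matches the teacher's logits incurs no soft-label penalty, so the loss reduces to the weighted cross-entropy term, while a student that disagrees with the teacher is penalized by both terms.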

Taxonomy

Topics: Machine Learning in Healthcare