TL;DR
This paper introduces DSG-KD, a knowledge distillation method that transfers domain-specific knowledge from specialized models to general language models, improving performance on non-English medical classification tasks.
Contribution
The study proposes a novel knowledge distillation approach to enhance general language models with domain-specific knowledge, especially for non-English medical data.
Findings
Outperforms baseline models on Korean PED EMR data
Effectively transfers domain knowledge via distillation
Improves classification accuracy in non-English contexts
Abstract
The use of pre-trained language models fine-tuned to address specific downstream tasks is a common approach in natural language processing (NLP). However, acquiring domain-specific knowledge via fine-tuning is challenging. Traditional methods involve pretraining language models using vast amounts of domain-specific data before fine-tuning for particular tasks. This study investigates emergency/non-emergency classification tasks based on electronic medical record (EMR) data obtained from pediatric emergency departments (PEDs) in Korea. Our findings reveal that existing domain-specific pre-trained language models underperform compared to general language models in handling N-lingual free-text data characteristics of non-English-speaking regions. To address these limitations, we propose a domain knowledge transfer methodology that leverages knowledge distillation to infuse general language…
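The abstract describes transferring knowledge from a domain-specific model to a general language model through knowledge distillation. As a rough illustration of the general idea only, here is a minimal sketch of the classic soft-label distillation loss, in which a student is trained to match a teacher's temperature-softened output distribution alongside the true label. This is an assumption made for illustration: the summary above does not give the exact DSG-KD objective, and the function names, the `temperature` and `alpha` parameters, and the example logits are all hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Generic soft-label distillation loss (not the paper's exact objective).

    alpha weights the teacher-matching term against the usual
    cross-entropy on the true label.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft-target term on a comparable scale
    # as the temperature changes.
    kd_term = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    # Cross-entropy of the student's unsoftened prediction on the true label.
    ce_term = -math.log(softmax(student_logits)[label])
    return alpha * kd_term + (1 - alpha) * ce_term


# Hypothetical logits for a two-class (emergency / non-emergency) example.
teacher = [2.0, -1.0]   # stand-in for a domain-specific teacher
student = [0.5, 0.2]    # stand-in for a general language model student
loss = distillation_loss(student, teacher, label=0)
```

In this sketch the teacher-matching term is zero when the student reproduces the teacher's logits, so minimizing the loss pulls the student toward the teacher's output distribution while still fitting the labels.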
Taxonomy
Methods: Knowledge Distillation
