DSG-KD: Knowledge Distillation from Domain-Specific to General Language Models

Sangyeon Cho; Jangyeong Jeon; Dongjoon Lee; Changhee Lee; Junyeong Kim

arXiv:2409.14904 · cs.CL · September 24, 2024

TL;DR

This paper introduces DSG-KD, a knowledge distillation method that transfers domain-specific knowledge from specialized models to general language models, improving performance on non-English medical classification tasks.

Contribution

The study proposes a novel knowledge distillation approach to enhance general language models with domain-specific knowledge, especially for non-English medical data.

Findings

01 Outperforms baseline models on Korean PED EMR data

02 Effectively transfers domain knowledge via distillation

03 Improves classification accuracy in non-English contexts

Abstract

The use of pre-trained language models fine-tuned to address specific downstream tasks is a common approach in natural language processing (NLP). However, acquiring domain-specific knowledge via fine-tuning is challenging. Traditional methods involve pretraining language models using vast amounts of domain-specific data before fine-tuning for particular tasks. This study investigates emergency/non-emergency classification tasks based on electronic medical record (EMR) data obtained from pediatric emergency departments (PEDs) in Korea. Our findings reveal that existing domain-specific pre-trained language models underperform compared to general language models in handling N-lingual free-text data characteristics of non-English-speaking regions. To address these limitations, we propose a domain knowledge transfer methodology that leverages knowledge distillation to infuse general language…
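To make the term "knowledge distillation" concrete, the following is a minimal sketch of a generic soft-target distillation loss (in the style of Hinton et al.), where a student matches a teacher's temperature-softened output distribution while also fitting the ground-truth label. It is not taken from the paper or its repository: the abstract above is truncated before it describes the actual DSG-KD objective, so the function names, `temperature`, and the `alpha` weighting are all illustrative assumptions. In this setting the teacher would stand in for a domain-specific model and the student for a general language model on a binary emergency/non-emergency task.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical generic KD loss, NOT the DSG-KD objective.

    alpha weights the soft (teacher-matching) term; (1 - alpha) weights
    the hard cross-entropy term against the ground-truth label.
    """
    # Soft term: student imitates the teacher's softened distribution.
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft_term = kl_divergence(soft_teacher, soft_student) * temperature ** 2

    # Hard term: ordinary cross-entropy at temperature 1.
    hard_probs = softmax(student_logits)
    hard_term = -math.log(hard_probs[label] + 1e-12)

    return alpha * soft_term + (1 - alpha) * hard_term


# Illustrative logits for [non-emergency, emergency]; the values are made up.
teacher = [-1.0, 2.0]
agreeing_student = [-1.0, 2.0]
disagreeing_student = [2.0, -1.0]

loss_agree = distillation_loss(agreeing_student, teacher, label=1)
loss_disagree = distillation_loss(disagreeing_student, teacher, label=1)
```

With these made-up inputs, the student that agrees with both the teacher and the label incurs a smaller loss than the one that disagrees, which is the behavior a distillation objective is meant to reward. Scaling the soft term by `temperature ** 2` is a common convention for keeping its gradient magnitude comparable to the hard term.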

Code & Models

Repositories

josangyeon/dsg-kd (PyTorch, official)

Taxonomy

Methods: Knowledge Distillation