Evolving Knowledge Distillation with Large Language Models and Active Learning

Chengyuan Liu; Yangyang Kang; Fubang Zhao; Kun Kuang; Zhuoren Jiang; Changlong Sun; Fei Wu

arXiv:2403.06414·cs.CL·March 12, 2024·2 cites

Open Access

TL;DR

EvoKD introduces an active learning-based approach to knowledge distillation from large language models, iteratively improving small models' performance by analyzing weaknesses and generating targeted, challenging training samples.

Contribution

The paper presents a novel active learning framework for knowledge distillation that actively analyzes student weaknesses and guides LLMs to generate more effective training data.

Findings

01. EvoKD improves small model performance on NLP tasks

02. Active analysis leads to more diverse and challenging training samples

03. Method outperforms traditional distillation approaches

Abstract

Large language models (LLMs) have demonstrated remarkable capabilities across various NLP tasks. However, their computational costs are prohibitively high. To address this issue, previous research has attempted to distill the knowledge of LLMs into smaller models by generating annotated data. Nonetheless, these works have mainly focused on the direct use of LLMs for text generation and labeling, without fully exploring their potential to comprehend the target task and acquire valuable knowledge. In this paper, we propose EvoKD: Evolving Knowledge Distillation, which leverages the concept of active learning to interactively enhance the process of data generation using large language models, while simultaneously improving the task capabilities of a small domain model (the student model). Unlike previous work, we actively analyze the student model's weaknesses, and then synthesize labeled…
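
The loop the abstract describes (probe the student, find where it fails, ask the LLM for new labeled samples aimed at those failures, retrain) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `BagOfWordsStudent`, `stub_teacher`, and `distill_loop` are hypothetical names, the student is a toy word-count classifier, and the teacher is a placeholder function standing in for an LLM call. The probe labels are hardcoded here; in a real setup they would presumably come from the LLM as well.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, str]  # (text, label)


class BagOfWordsStudent:
    """Toy stand-in for the student model: per-label word counts."""

    def __init__(self) -> None:
        self.counts: Dict[str, Counter] = {}

    def fit(self, data: List[Sample]) -> None:
        # Retrain from scratch on the full training set.
        self.counts = {}
        for text, label in data:
            self.counts.setdefault(label, Counter()).update(text.lower().split())

    def predict(self, text: str) -> str:
        words = text.lower().split()
        # Score each label by how often its training texts used these words.
        scores = {lbl: sum(c[w] for w in words) for lbl, c in self.counts.items()}
        # Ties go to the alphabetically first label, which keeps runs deterministic.
        return max(sorted(scores), key=lambda lbl: scores[lbl])


def stub_teacher(failures: List[Sample]) -> List[Sample]:
    """Hypothetical placeholder for the LLM call.

    A real teacher would be prompted with the student's mistakes and asked to
    return new labeled samples. Here each failure is echoed back with one extra
    word so the example runs without any API access.
    """
    return [(f"{text} indeed", label) for text, label in failures]


def distill_loop(
    student: BagOfWordsStudent,
    teacher: Callable[[List[Sample]], List[Sample]],
    seed: List[Sample],
    probe: List[Sample],
    rounds: int = 3,
) -> List[int]:
    """Alternate between finding student failures and training on new samples.

    Returns the number of probe failures observed at the start of each round.
    """
    train = list(seed)
    student.fit(train)
    history: List[int] = []
    for _ in range(rounds):
        # Step 1: find where the student is currently wrong.
        failures = [(t, y) for t, y in probe if student.predict(t) != y]
        history.append(len(failures))
        if not failures:
            break
        # Step 2: ask the teacher for samples that target those weaknesses.
        train.extend(teacher(failures))
        # Step 3: retrain the student on the enlarged training set.
        student.fit(train)
    return history


seed = [("great movie", "pos"), ("terrible movie", "neg")]
probe = [("great acting", "pos"), ("boring plot", "neg"), ("wonderful story", "pos")]

student = BagOfWordsStudent()
history = distill_loop(student, stub_teacher, seed, probe)
print(history)
```

On this toy data the student misses one probe sample in the first round and none in the second, so the loop stops early. Swapping `stub_teacher` for a function that actually prompts an LLM is the part this sketch leaves open.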

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Knowledge Distillation