Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale

Flavio Di Palo; Prateek Singhi; Bilal Fadlallah

arXiv:2411.05045·cs.CL·November 11, 2024

TL;DR

This paper introduces Performance-Guided Knowledge Distillation (PGKD), a cost-effective method that distills large language models into smaller, efficient models for text classification, significantly reducing inference costs and latency.

Contribution

The paper presents a novel, performance-aware active learning framework for LLM knowledge distillation, tailored to multi-class, sparsely annotated datasets, that outperforms traditional BERT-base models and other knowledge distillation methods.

Findings

01. PGKD models are up to 130X faster than LLMs at inference.

02. PGKD reduces inference costs by up to 25X.

03. PGKD outperforms traditional BERT-base models and other distillation methods.

Abstract

Large Language Models (LLMs) face significant challenges at inference time due to their high computational demands. To address this, we present Performance-Guided Knowledge Distillation (PGKD), a cost-effective and high-throughput solution for production text classification applications. PGKD utilizes teacher-student Knowledge Distillation to distill the knowledge of LLMs into smaller, task-specific models. PGKD establishes an active learning routine between the student model and the LLM; the LLM continuously generates new training data leveraging hard-negative mining, student model validation performance, and early-stopping protocols to inform the data generation. By employing a cyclical, performance-aware approach tailored for highly multi-class, sparsely annotated datasets prevalent in industrial text classification, PGKD effectively addresses training challenges and outperforms…
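The cyclical routine described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `teacher_generate` is a hypothetical deterministic stub standing in for the LLM, `Student` is a toy word-count classifier standing in for the student model, hard-negative mining is approximated by collecting the labels of misclassified validation examples, and the `patience` and `n_per_label` parameters are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical vocabulary used by the stub teacher; a real system would prompt an LLM.
VOCAB = {
    "billing": ["invoice", "refund", "charge", "payment"],
    "shipping": ["delivery", "package", "tracking", "courier"],
    "account": ["password", "login", "profile", "email"],
}

def teacher_generate(focus_labels, n_per_label=4):
    """Stub teacher: emit (text, label) pairs for the labels the student gets wrong."""
    out = []
    for label in focus_labels:
        words = VOCAB[label]
        for i in range(n_per_label):
            text = f"{words[i % len(words)]} {words[(i + 1) % len(words)]}"
            out.append((text, label))
    return out

class Student:
    """Toy student: scores each label by how often it has seen the input words."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, data):
        self.counts = defaultdict(Counter)
        for text, label in data:
            self.counts[label].update(text.split())

    def predict(self, text):
        if not self.counts:
            return None
        scores = {lab: sum(c[w] for w in text.split()) for lab, c in self.counts.items()}
        return max(sorted(scores), key=scores.get)  # ties go to the first label alphabetically

def evaluate(student, val):
    """Return validation accuracy and the misclassified examples."""
    wrong = [(t, y) for t, y in val if student.predict(t) != y]
    return 1 - len(wrong) / len(val), wrong

def pgkd_loop(train, val, max_rounds=10, patience=2):
    """Train, validate, ask the teacher for data on failing labels, and repeat."""
    student, data = Student(), list(train)
    best_acc, stale, history = -1.0, 0, []
    for _ in range(max_rounds):
        student.fit(data)
        acc, wrong = evaluate(student, val)
        history.append(acc)
        if acc > best_acc:
            best_acc, stale = acc, 0
        else:
            stale += 1
        if not wrong or stale >= patience:  # early stopping on validation accuracy
            break
        hard_labels = sorted({y for _, y in wrong})  # approximation of hard-negative mining
        data += teacher_generate(hard_labels)
    return student, history

train = [("invoice refund", "billing")]
val = [
    ("charge payment", "billing"),
    ("delivery tracking", "shipping"),
    ("password login", "account"),
]
student, history = pgkd_loop(train, val)
```

In this toy run the validation accuracy recorded in `history` rises each round, because the teacher is only asked for data on the labels the student currently misclassifies. The loop ends once the student makes no validation errors or stops improving for `patience` rounds.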

Taxonomy

Topics: Text and Document Classification Technologies

Methods: Knowledge Distillation