Minimizing PLM-Based Few-Shot Intent Detectors

Haode Zhang; Albert Y.S. Lam; Xiao-Ming Wu

arXiv:2407.09943·cs.CL·September 17, 2024

TL;DR

This paper presents techniques that shrink PLM-based few-shot intent detectors by a factor of 21 in memory usage, making them suitable for resource-constrained environments such as mobile devices with almost no loss in performance.

Contribution

It combines LLM-based data augmentation, knowledge distillation, and a vocabulary pruning mechanism (V-Prune) to compress intent detectors by a factor of 21 in memory usage while keeping performance almost identical.
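
To make the distillation step concrete, here is a minimal sketch of a classic soft-label distillation loss, in which a small student is trained to match a large teacher's temperature-softened output distribution. This is an assumption for illustration only: the listing does not say which distillation method the paper uses, and the function names, the temperature value, and the example logits are all hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    temperature: float = 2.0,  # hypothetical value, not taken from the paper
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# Made-up logits over three intents for a single utterance.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different intent

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
```

The loss is zero when student and teacher logits coincide and grows as the student's predicted distribution drifts away from the teacher's, which is the signal a smaller student would be trained on.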

Findings

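1. Achieved a 21x reduction in model memory usage, covering both the Transformer and the vocabulary

2. Maintained almost identical performance on four real-world benchmarks

3. Combined data augmentation, knowledge distillation, and vocabulary pruning effectively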

Abstract

Recent research has demonstrated the feasibility of training efficient intent detectors based on pre-trained language models (PLMs) with limited labeled data. However, deploying these detectors in resource-constrained environments such as mobile devices poses challenges due to their large size. In this work, we aim to address this issue by exploring techniques to minimize the size of PLM-based intent detectors trained with few-shot data. Specifically, we utilize large language models (LLMs) for data augmentation, employ a cutting-edge model compression method for knowledge distillation, and devise a vocabulary pruning mechanism called V-Prune. Through these approaches, we achieve a compression ratio of 21 in model memory usage, covering both the Transformer and the vocabulary, while maintaining almost identical performance on four real-world benchmarks.
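
The abstract names V-Prune but does not describe how it works, so the following is only a generic sketch of vocabulary pruning, not the paper's mechanism. It assumes the simplest possible rule: keep the tokens that occur in a task corpus (plus special tokens), drop every other row of the embedding table, and map any dropped token to an unknown-token id. All names, the toy vocabulary, and the corpus are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def prune_vocab(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    corpus_tokens: Sequence[Sequence[str]],
    always_keep: Tuple[str, ...] = ("[PAD]", "[UNK]"),
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Keep only tokens seen in the corpus, plus the special tokens."""
    used = {tok for sent in corpus_tokens for tok in sent if tok in vocab}
    used.update(tok for tok in always_keep if tok in vocab)
    # Preserve the original id order so the new ids are deterministic.
    kept = sorted(used, key=lambda tok: vocab[tok])
    new_vocab = {tok: new_id for new_id, tok in enumerate(kept)}
    new_embeddings = [embeddings[vocab[tok]] for tok in kept]
    return new_vocab, new_embeddings


def encode(tokens: Sequence[str], vocab: Dict[str, int], unk: str = "[UNK]") -> List[int]:
    """Map tokens to ids; anything outside the vocabulary becomes the unknown id."""
    return [vocab.get(tok, vocab[unk]) for tok in tokens]


# Toy vocabulary with a 2-dimensional embedding row per token.
vocab = {"[PAD]": 0, "[UNK]": 1, "book": 2, "flight": 3, "weather": 4, "piano": 5}
embeddings = [[float(i), float(i) + 0.5] for i in range(len(vocab))]
corpus = [["book", "flight"], ["book", "hotel"]]  # "hotel" is not in the vocabulary

small_vocab, small_emb = prune_vocab(vocab, embeddings, corpus)
ids = encode(["book", "piano"], small_vocab)  # "piano" was pruned
```

In this toy case the embedding table shrinks from six rows to four. Because the embedding table counts toward model memory, a reduction of this kind contributes to the overall compression ratio alongside the smaller Transformer.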

Code & Models

Repositories

hdzhang-code/smallID (JAX · Official)

Taxonomy

Topics: Radiation Detection and Scintillator Technologies · Particle Detector Development and Performance · Radiation Effects in Electronics

Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Linear Layer · Label Smoothing · Adam · Dropout · Multi-Head Attention · Dense Connections · Softmax