Minimizing PLM-Based Few-Shot Intent Detectors
Haode Zhang, Albert Y.S. Lam, Xiao-Ming Wu

TL;DR
This paper presents techniques that substantially reduce the size of PLM-based few-shot intent detectors, making them suitable for resource-constrained environments such as mobile devices while keeping performance almost identical.
Contribution
It combines LLM-based data augmentation, knowledge distillation, and a vocabulary pruning mechanism called V-Prune to compress the memory usage of intent detectors by a factor of 21 while maintaining almost identical performance.
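As a rough illustration of the distillation component, the sketch below computes a classic soft-target distillation loss between teacher and student logits. This is a generic stand-in, not the specific compression method used in the paper (the summary does not name it); the function names and the temperature value are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,  # assumed value, not taken from the paper
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# Hypothetical logits over three intent classes.
teacher = [4.0, 1.0, 0.5]
matched_student = [4.0, 1.0, 0.5]
mismatched_student = [0.5, 1.0, 4.0]

loss_matched = distillation_loss(matched_student, teacher)
loss_mismatched = distillation_loss(mismatched_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is what lets a small student be trained on the teacher's outputs rather than on scarce labels alone.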
Findings
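Achieved a 21x reduction in model memory usage, covering both the Transformer and the vocabulary
Maintained almost identical performance on four real-world benchmarks
Combined LLM-based data augmentation, knowledge distillation, and vocabulary pruning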
Abstract
Recent research has demonstrated the feasibility of training efficient intent detectors based on pre-trained language models (PLMs) with limited labeled data. However, deploying these detectors in resource-constrained environments such as mobile devices poses challenges due to their large sizes. In this work, we aim to address this issue by exploring techniques to minimize the size of PLM-based intent detectors trained with few-shot data. Specifically, we utilize large language models (LLMs) for data augmentation, employ a cutting-edge model compression method for knowledge distillation, and devise a vocabulary pruning mechanism called V-Prune. Through these approaches, we successfully achieve a compression ratio of 21 in model memory usage, including both the Transformer and the vocabulary, while maintaining almost identical performance levels on four real-world benchmarks.
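To make the idea of vocabulary pruning concrete, here is a minimal sketch of one generic approach: keep only the special tokens and the tokens that occur in the task data, remap their IDs, and slice the embedding table accordingly. It is not a description of V-Prune itself, whose details are not given in this summary; the function names, special-token list, and toy data are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def prune_vocabulary(
    vocab: Dict[str, int],
    embeddings: Sequence[Sequence[float]],
    corpus_tokens: Sequence[Sequence[str]],
    always_keep: Sequence[str] = ("[PAD]", "[UNK]", "[CLS]", "[SEP]"),
) -> Tuple[Dict[str, int], List[List[float]]]:
    """Drop every token that is neither special nor seen in the corpus."""
    seen = {tok for sentence in corpus_tokens for tok in sentence}
    kept = [tok for tok in vocab if tok in seen or tok in always_keep]
    # Sort by original ID so the new numbering is deterministic.
    kept.sort(key=lambda tok: vocab[tok])
    new_vocab = {tok: new_id for new_id, tok in enumerate(kept)}
    # Copy only the embedding rows that belong to surviving tokens.
    new_embeddings = [list(embeddings[vocab[tok]]) for tok in kept]
    return new_vocab, new_embeddings


def encode(tokens: Sequence[str], vocab: Dict[str, int], unk: str = "[UNK]") -> List[int]:
    """Map tokens to IDs; pruned or unknown tokens fall back to the UNK ID."""
    return [vocab.get(tok, vocab[unk]) for tok in tokens]


# Toy vocabulary and a two-dimensional embedding table (hypothetical values).
vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
         "book": 4, "flight": 5, "weather": 6, "pizza": 7}
embeddings = [[float(i), float(i) + 0.5] for i in range(len(vocab))]
corpus = [["book", "flight"], ["book", "hotel"]]

pruned_vocab, pruned_embeddings = prune_vocabulary(vocab, embeddings, corpus)
encoded = encode(["book", "pizza"], pruned_vocab)
```

In this toy case the embedding table shrinks from eight rows to six, and a pruned token such as "pizza" is encoded as the unknown token. In a real PLM the embedding table has tens of thousands of rows, which is why pruning the vocabulary can contribute meaningfully to the overall memory reduction.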
Taxonomy
Topics: Radiation Detection and Scintillator Technologies · Particle Detector Development and Performance · Radiation Effects in Electronics
Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Linear Layer · Label Smoothing · Adam · Dropout · Multi-Head Attention · Dense Connections · Softmax
