Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization
Haode Zhang, Haowen Liang, Yuwei Zhang, Liming Zhan, Xiaolei Lu, Albert Y.S. Lam, Xiao-Ming Wu

TL;DR
This paper improves few-shot intent detection by regularizing pre-trained language models towards an isotropic feature space, enhancing their semantic representations and classification performance.
Contribution
It introduces two novel regularizers based on contrastive learning and correlation matrices to isotropize feature space during supervised pre-training.
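To make the correlation-matrix idea concrete, here is a minimal, dependency-free sketch. It assumes the penalty is the squared Frobenius distance between the feature correlation matrix and the identity; the summary does not give the exact formula, so this form and the function names `correlation_matrix` and `correlation_penalty` are illustrative assumptions, not the authors' implementation.

```python
import math


def correlation_matrix(features):
    """Pearson correlation matrix over the feature dimensions of a batch.

    `features` is a list of equal-length vectors, one per utterance.
    """
    n = len(features)
    d = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    # Per-dimension standard deviation; eps guards against constant dimensions.
    eps = 1e-12
    stds = [math.sqrt(sum(r[j] ** 2 for r in centered) / n) + eps
            for j in range(d)]
    corr = [[0.0] * d for _ in range(d)]
    for a in range(d):
        for b in range(d):
            cov = sum(r[a] * r[b] for r in centered) / n
            corr[a][b] = cov / (stds[a] * stds[b])
    return corr


def correlation_penalty(features):
    """Assumed penalty: squared Frobenius distance from the identity matrix.

    It is near zero when feature dimensions are uncorrelated and grows
    as dimensions become correlated with one another.
    """
    corr = correlation_matrix(features)
    d = len(corr)
    return sum((corr[a][b] - (1.0 if a == b else 0.0)) ** 2
               for a in range(d) for b in range(d))


# Toy batches: one with uncorrelated dimensions, one with identical dimensions.
uncorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
correlated = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
penalty_uncorrelated = correlation_penalty(uncorrelated)
penalty_correlated = correlation_penalty(correlated)
```

In a training loop, a term like this would typically be added to the supervised classification loss with a weighting coefficient; that weighting is likewise an assumption here.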
Findings
Isotropization regularizers improve intent detection accuracy.
Regularized models outperform baseline pre-trained models.
The regularizers remain effective in low-data, few-shot scenarios.
Abstract
It is challenging to train a good intent classifier for a task-oriented dialogue system with only a few annotations. Recent studies have shown that fine-tuning pre-trained language models with a small amount of labeled utterances from public benchmarks in a supervised manner is extremely helpful. However, we find that supervised pre-training yields an anisotropic feature space, which may suppress the expressive power of the semantic representations. Inspired by recent research in isotropization, we propose to improve supervised pre-training by regularizing the feature space towards isotropy. We propose two regularizers based on contrastive learning and correlation matrix respectively, and demonstrate their effectiveness through extensive experiments. Our main finding is that it is promising to regularize supervised pre-training with isotropization to further improve the performance of…
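As a rough illustration of a contrastive objective of the kind the abstract alludes to, the sketch below computes an InfoNCE-style loss over two views of the same batch. The abstract does not specify the loss, how the views are produced, or the temperature, so `info_nce`, its `temperature` default, and the toy vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(view_a, view_b, temperature=0.05):
    """InfoNCE-style loss: view_a[i] should match view_b[i] and no other row.

    The temperature value is a hypothetical default, not taken from the paper.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


# Toy example: two utterance embeddings and slightly perturbed second views.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]
loss_aligned = info_nce(view_a, view_b)
# Swapping the second view pairs each anchor with the wrong positive.
loss_mismatched = info_nce(view_a, list(reversed(view_b)))
```

Minimizing a loss of this shape pulls matched views together and pushes other utterances in the batch apart, which spreads features more evenly across directions.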
Taxonomy
Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Speech and dialogue systems
Methods: Contrastive Learning
