Fine-tuning Pre-trained Language Models for Few-shot Intent Detection: Supervised Pre-training and Isotropization

Haode Zhang; Haowen Liang; Yuwei Zhang; Liming Zhan; Xiaolei Lu; Albert Y.S. Lam; Xiao-Ming Wu

arXiv:2205.07208 · cs.CL · September 17, 2024

Open Access · 1 Repo

TL;DR

This paper improves few-shot intent detection by regularizing pre-trained language models towards an isotropic feature space, enhancing their semantic representations and classification performance.

Contribution

It introduces two regularizers, one based on contrastive learning and one on the correlation matrix, that push the feature space towards isotropy during supervised pre-training.
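
To make the correlation-matrix idea concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the penalty is the Frobenius distance between the Pearson correlation matrix of the feature dimensions and the identity matrix. The function names `correlation_matrix` and `correlation_penalty` are hypothetical, and plain Python lists stand in for encoder outputs so the example runs without any ML framework.

```python
import math


def correlation_matrix(features):
    """Pearson correlation matrix of the feature dimensions.

    features: list of n rows, each a list of d floats (one row per utterance).
    """
    n, d = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    eps = 1e-12  # guards against division by zero for constant dimensions
    stds = [
        math.sqrt(sum(row[j] ** 2 for row in centered) / n) + eps
        for j in range(d)
    ]
    return [
        [
            sum(row[i] * row[j] for row in centered) / (n * stds[i] * stds[j])
            for j in range(d)
        ]
        for i in range(d)
    ]


def correlation_penalty(features):
    """Frobenius distance between the correlation matrix and the identity.

    Assumption: a feature space whose dimensions are mutually uncorrelated
    is treated as closer to isotropic, so a smaller value is better.
    """
    corr = correlation_matrix(features)
    d = len(corr)
    return math.sqrt(
        sum(
            (corr[i][j] - (1.0 if i == j else 0.0)) ** 2
            for i in range(d)
            for j in range(d)
        )
    )


# Two perfectly correlated dimensions are penalized heavily...
redundant = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
# ...while two uncorrelated dimensions incur almost no penalty.
decorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
print(correlation_penalty(redundant), correlation_penalty(decorrelated))
```

In a training loop, a penalty of this kind would presumably be added to the supervised classification loss with a weighting coefficient; that coefficient and its value are not given on this page.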

Findings

01 Isotropization regularizers improve intent detection accuracy.

02 Regularized models outperform baseline pre-trained models.

03 The regularizers are effective in low-data, few-shot scenarios.

Abstract

It is challenging to train a good intent classifier for a task-oriented dialogue system with only a few annotations. Recent studies have shown that fine-tuning pre-trained language models with a small amount of labeled utterances from public benchmarks in a supervised manner is extremely helpful. However, we find that supervised pre-training yields an anisotropic feature space, which may suppress the expressive power of the semantic representations. Inspired by recent research in isotropization, we propose to improve supervised pre-training by regularizing the feature space towards isotropy. We propose two regularizers based on contrastive learning and the correlation matrix, respectively, and demonstrate their effectiveness through extensive experiments. Our main finding is that it is promising to regularize supervised pre-training with isotropization to further improve the performance of…
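
The contrastive-learning regularizer mentioned in the abstract can be illustrated with a generic InfoNCE-style loss. This is a sketch under assumptions, not the paper's exact formulation: it assumes each utterance is encoded twice (for example, with different dropout noise) to form a positive pair, and that the other utterances in the batch act as negatives. The names `cosine`, `contrastive_penalty`, and the `temperature` value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_penalty(view_a, view_b, temperature=0.05):
    """Mean InfoNCE loss over a batch.

    view_a[i] and view_b[i] are two encodings of the same utterance
    (the positive pair); every other row of view_b is a negative.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Numerically stable log-sum-exp of the logits.
        peak = max(logits)
        log_denominator = peak + math.log(
            sum(math.exp(value - peak) for value in logits)
        )
        total += log_denominator - logits[i]
    return total / len(view_a)


# Matching views give a loss near zero; swapped views give a large loss.
batch = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(contrastive_penalty(batch, batch), contrastive_penalty(batch, swapped))
```

Minimizing a loss of this shape pulls the two encodings of one utterance together while pushing different utterances apart, which tends to spread features more uniformly over the space.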

Code & Models

Repositories

fanolabs/isointentbert-main
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Speech and dialogue systems

Methods: Contrastive Learning