A Simple Meta-learning Paradigm for Zero-shot Intent Classification with Mixture Attention Mechanism
Han Liu, Siyang Zhao, Xiaotong Zhang, Feng Zhang, Junjie Sun, Hong Yu, Xianchao Zhang

TL;DR
This paper introduces a simple meta-learning framework combined with a new mixture attention mechanism for zero-shot intent classification, aiming to improve both utterance feature extraction and model generalization to unseen intents.
Contribution
It proposes a new mixture attention mechanism for better semantic representation and a meta-learning strategy to improve transferability to unseen classes.
Findings
Outperforms strong baselines on real-world datasets
Effective in both standard and generalized zero-shot tasks
Enhances utterance feature extraction and model generalization
Abstract
Zero-shot intent classification is a vital and challenging task in dialogue systems, which aims to deal with numerous fast-emerging unacquainted intents without annotated training data. To obtain more satisfactory performance, the crucial points lie in two aspects: extracting better utterance features and strengthening the model generalization ability. In this paper, we propose a simple yet effective meta-learning paradigm for zero-shot intent classification. To learn better semantic representations for utterances, we introduce a new mixture attention mechanism, which encodes the pertinent word occurrence patterns by leveraging the distributional signature attention and multi-layer perceptron attention simultaneously. To strengthen the transfer ability of the model from seen classes to unseen classes, we reformulate zero-shot intent classification with a meta-learning strategy, which…
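The abstract names two attention signals, distributional signature attention and multi-layer perceptron (MLP) attention, but does not say how they are combined. The following is a minimal sketch of one way such a mixture could look, not the authors' implementation. Everything in it is an assumption made for illustration: the inverse-frequency form of the signature score, the additive MLP scoring function, the convex combination controlled by the hypothetical `mix` parameter, and all function names and toy values.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def signature_scores(tokens, unigram_prob, eps=1e-3):
    # Assumed form of a distributional signature: rarer words score higher,
    # using eps / (eps + P(w)). Unknown words are treated as probability 0.
    return [eps / (eps + unigram_prob.get(t, 0.0)) for t in tokens]

def mlp_scores(embeddings, W, b, v):
    # Assumed additive (MLP) attention: score_i = v . tanh(W h_i + b).
    scores = []
    for h in embeddings:
        hidden = [
            math.tanh(sum(w * x for w, x in zip(row, h)) + b_j)
            for row, b_j in zip(W, b)
        ]
        scores.append(sum(v_j * z_j for v_j, z_j in zip(v, hidden)))
    return scores

def mixture_attention(tokens, embeddings, unigram_prob, W, b, v, mix=0.5):
    # Normalize each signal separately, then take a convex combination.
    # The combination rule is a guess; the source does not specify it.
    a_sig = softmax(signature_scores(tokens, unigram_prob))
    a_mlp = softmax(mlp_scores(embeddings, W, b, v))
    weights = [mix * s + (1.0 - mix) * m for s, m in zip(a_sig, a_mlp)]
    dim = len(embeddings[0])
    # Attention-weighted sum of word embeddings gives the utterance vector.
    pooled = [sum(w * h[k] for w, h in zip(weights, embeddings)) for k in range(dim)]
    return weights, pooled

# Toy inputs (all values are made up for illustration).
tokens = ["play", "the", "jazz"]
embeddings = [[0.1, 0.2], [0.0, 0.1], [0.3, -0.2]]
unigram_prob = {"play": 0.001, "the": 0.05, "jazz": 0.0001}
W = [[0.5, -0.1], [0.2, 0.3]]
b = [0.0, 0.1]
v = [1.0, -1.0]

weights, pooled = mixture_attention(tokens, embeddings, unigram_prob, W, b, v)
```

In this sketch the signature branch depends only on corpus statistics, while the MLP branch depends on the word embeddings and would be trained. Setting `mix=1.0` or `mix=0.0` recovers either branch alone, which makes it easy to compare them.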
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications
