Label Semantic Aware Pre-training for Few-shot Text Classification
Aaron Mueller, Jason Krone, Salvatore Romeo, Saab Mansour, Elman Mansimov, Yi Zhang, Dan Roth

TL;DR
This paper introduces Label Semantic Aware Pre-training (LSAP), which improves few-shot text classification by running a secondary pre-training stage for a generative model (T5) on sentence-label pairs drawn from a variety of domains, many of them created automatically from unlabeled text.
Contribution
The paper presents LSAP, a novel pre-training approach that leverages label semantics and automatic sentence-label pair creation to improve few-shot text classification performance.
Findings
LSAP significantly outperforms existing models in few-shot settings.
LSAP maintains competitive performance in high-resource scenarios.
Automatic data filtering effectively creates useful sentence-label pairs.
Abstract
In text classification tasks, useful information is encoded in the label names. Label semantic aware systems have leveraged this information for improved text classification performance during fine-tuning and prediction. However, use of label-semantics during pre-training has not been extensively explored. We therefore propose Label Semantic Aware Pre-training (LSAP) to improve the generalization and data efficiency of text classification systems. LSAP incorporates label semantics into pre-trained generative models (T5 in our case) by performing secondary pre-training on labeled sentences from a variety of domains. As domain-general pre-training requires large amounts of data, we develop a filtering and labeling pipeline to automatically create sentence-label pairs from unlabeled text. We perform experiments on intent (ATIS, Snips, TOPv2) and topic classification (AG News, Yahoo!…
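As a rough illustration of the idea in the abstract, the sketch below casts a labeled sentence as a text-to-text training pair whose target is the label written out in natural language, so a generative model such as T5 sees the words of the label rather than an opaque class index. This is a minimal sketch and not the authors' implementation: the `classify intent: ` prefix, the `Seq2SeqExample` container, and the `label_to_text` and `build_example` helpers are all hypothetical names chosen for this example, and the paper's actual input formats are not given in this summary.

```python
import re
from dataclasses import dataclass


@dataclass
class Seq2SeqExample:
    source: str  # text fed to the encoder
    target: str  # text the decoder is trained to generate


def label_to_text(label: str) -> str:
    """Turn a label identifier such as 'BookRestaurant' into plain words."""
    # Insert a space at lower-to-upper case boundaries (CamelCase labels).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    # Treat underscores and hyphens as word separators.
    spaced = re.sub(r"[_\-]+", " ", spaced)
    # Lowercase and collapse repeated whitespace.
    return " ".join(spaced.lower().split())


def build_example(sentence: str, label: str,
                  prefix: str = "classify intent: ") -> Seq2SeqExample:
    """Pair a sentence with its natural-language label (prefix is an assumption)."""
    return Seq2SeqExample(source=prefix + sentence.strip(),
                          target=label_to_text(label))


if __name__ == "__main__":
    example = build_example("book a table for two at 7pm", "BookRestaurant")
    print(example.source)
    print(example.target)
```

Pairs built this way could then be tokenized and passed to any sequence-to-sequence trainer; the filtering and labeling pipeline that produces sentence-label pairs from unlabeled text is a separate step not sketched here.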
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
Methods: Attentive Walk-Aggregating Graph Neural Network
