Revisiting Self-Training for Few-Shot Learning of Language Model
Yiming Chen, Yan Zhang, Chen Zhang, Grandee Lee, Ran Cheng, and Haizhou Li

TL;DR
This paper introduces SFLM, a prompt-based self-training method that leverages unlabeled data with augmentation techniques to improve few-shot language model classification, outperforming existing methods.
Contribution
It presents a novel self-training approach using dual augmentations for prompt-based few-shot learning, achieving state-of-the-art results with minimal unlabeled data.
Findings
Outperforms existing supervised and semi-supervised methods on six sentence classification and six sentence-pair classification benchmarks.
Robust across different augmentation techniques, model sizes, and transfer settings.
Requires only a few unlabeled in-domain samples for effective learning.
Abstract
As unlabeled data carry rich task-relevant information, they are proven useful for few-shot learning of language model. The question is how to effectively make use of such data. In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM. Given two views of a text sample via weak and strong augmentation techniques, SFLM generates a pseudo label on the weakly augmented version. Then, the model predicts the same pseudo label when fine-tuned with the strongly augmented version. This simple approach is shown to outperform other state-of-the-art supervised and semi-supervised counterparts on six sentence classification and six sentence-pair classification benchmarking tasks. In addition, SFLM only relies on a few in-domain unlabeled data. We conduct a comprehensive analysis to demonstrate the…
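The weak/strong consistency step described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it covers just the pseudo-label term on unlabeled data, the function names (`log_softmax`, `pseudo_label`, `self_training_loss`) are hypothetical, and the confidence `threshold` is an assumption that the abstract does not state. The inputs are plain lists of class logits that a model would produce for the weakly and strongly augmented views of the same samples.

```python
import math
from typing import List, Optional


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over one list of class logits."""
    m = max(logits)
    log_total = math.log(sum(math.exp(x - m) for x in logits))
    return [x - m - log_total for x in logits]


def pseudo_label(weak_logits: List[float], threshold: float) -> Optional[int]:
    """Return the argmax class of the weak view, or None if the model's
    confidence is below the (assumed) threshold."""
    log_probs = log_softmax(weak_logits)
    best = max(range(len(log_probs)), key=log_probs.__getitem__)
    return best if math.exp(log_probs[best]) >= threshold else None


def self_training_loss(
    weak_batch: List[List[float]],
    strong_batch: List[List[float]],
    threshold: float = 0.95,  # assumed value, not taken from the abstract
) -> float:
    """Cross-entropy of the strong-view predictions against pseudo labels
    obtained from the weak view, averaged over the whole batch.
    Samples without a confident pseudo label contribute zero."""
    if not weak_batch:
        return 0.0
    total = 0.0
    for weak_logits, strong_logits in zip(weak_batch, strong_batch):
        label = pseudo_label(weak_logits, threshold)
        if label is None:
            continue  # low-confidence sample is masked out
        total += -log_softmax(strong_logits)[label]
    return total / len(weak_batch)


# Two unlabeled samples: the first weak view is confident, the second is not.
weak = [[5.0, 0.0], [0.1, 0.0]]
strong = [[2.0, 0.0], [0.0, 3.0]]
loss = self_training_loss(weak, strong)
```

In this toy batch only the first sample receives a pseudo label, so the second sample adds nothing to the loss. In an actual fine-tuning loop the pseudo label would be treated as a fixed target, with gradients flowing only through the strong-view prediction.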
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
