Revisiting Self-Training for Few-Shot Learning of Language Model

Yiming Chen, Yan Zhang, Chen Zhang, Grandee Lee, Ran Cheng, and Haizhou Li

arXiv:2110.01256·cs.CL·October 5, 2021

TL;DR

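This paper introduces SFLM, a prompt-based self-training method that uses weakly and strongly augmented views of unlabeled text to improve few-shot language model classification, outperforming other supervised and semi-supervised methods on six sentence classification and six sentence-pair classification benchmarks.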

Contribution

It presents a self-training approach for prompt-based few-shot learning that pairs a weak and a strong augmentation of each unlabeled sample, achieving state-of-the-art results with only a few in-domain unlabeled samples.

Findings

1. Outperforms existing supervised and semi-supervised methods on multiple benchmarks.
2. Robust across different augmentation techniques, model sizes, and transfer settings.
3. Requires only a few unlabeled in-domain samples for effective learning.

Abstract

As unlabeled data carry rich task-relevant information, they are proven useful for few-shot learning of language model. The question is how to effectively make use of such data. In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM. Given two views of a text sample via weak and strong augmentation techniques, SFLM generates a pseudo label on the weakly augmented version. Then, the model predicts the same pseudo label when fine-tuned with the strongly augmented version. This simple approach is shown to outperform other state-of-the-art supervised and semi-supervised counterparts on six sentence classification and six sentence-pair classification benchmarking tasks. In addition, SFLM only relies on a few in-domain unlabeled data. We conduct a comprehensive analysis to demonstrate the…
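
The weak/strong consistency step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of raw logits as input, and the confidence threshold of 0.9 (a FixMatch-style filter) are all assumptions that the abstract does not specify. The sketch takes model outputs for the weakly and strongly augmented views of each unlabeled sample, derives a pseudo label from the weak view, and scores the strong view against it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def self_training_loss(weak_logits, strong_logits, threshold=0.9):
    """Cross-entropy of strong-view predictions against weak-view pseudo labels.

    weak_logits / strong_logits: one list of class logits per unlabeled sample.
    threshold: hypothetical confidence cut-off (an assumption, not from the
    abstract); samples whose weak-view confidence falls below it contribute
    zero loss.
    Returns (loss averaged over all samples, pseudo_labels, mask).
    """
    pseudo_labels, mask, losses = [], [], []
    for w, s in zip(weak_logits, strong_logits):
        w_probs = softmax(w)
        # Pseudo label = most probable class under the weakly augmented view.
        label = max(range(len(w_probs)), key=w_probs.__getitem__)
        confident = w_probs[label] >= threshold
        pseudo_labels.append(label)
        mask.append(confident)
        if confident:
            # Push the strongly augmented view toward the same label.
            s_probs = softmax(s)
            losses.append(-math.log(s_probs[label]))
    loss = sum(losses) / len(weak_logits) if weak_logits else 0.0
    return loss, pseudo_labels, mask


# Two unlabeled samples, two classes: the first is confident, the second is not.
loss, labels, mask = self_training_loss(
    weak_logits=[[4.0, 0.0], [0.1, 0.0]],
    strong_logits=[[2.0, 0.0], [0.0, 1.0]],
)
```

In a real training loop the logits would come from the prompt-based language model applied to each augmented view, and this term would be combined with the supervised loss on the few labeled examples; how the two are weighted is not stated in the abstract.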

Code & Models

Repositories

matthewcym/sflm (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning