LiST: Lite Prompted Self-training Makes Parameter-Efficient Few-shot Learners

Yaqing Wang; Subhabrata Mukherjee; Xiaodong Liu; Jing Gao; Ahmed Hassan Awadallah; Jianfeng Gao

arXiv:2110.06274 · cs.CL · May 20, 2022 · 1 citation

TL;DR

LiST is a lightweight prompt-based self-training method that enhances parameter-efficient few-shot learning by leveraging unlabeled data and minimal task-specific parameters, outperforming traditional fine-tuning and GPT-3 in NLU tasks.

Contribution

LiST introduces a novel combination of self-training and lightweight fine-tuning with minimal parameters, significantly improving few-shot learning performance.
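
To make "minimal parameters" concrete, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the task-specific parameters are one bottleneck adapter (a down-projection followed by an up-projection, each with a bias) per encoder layer; this summary does not specify the module, and the function names and sizes below are hypothetical.

```python
def adapter_params(hidden: int, bottleneck: int) -> int:
    """Parameters in one bottleneck adapter (assumed design, not confirmed by the source)."""
    down = hidden * bottleneck + bottleneck  # down-projection weights + bias
    up = bottleneck * hidden + hidden        # up-projection weights + bias
    return down + up


def trainable_fraction(backbone_params: int, num_layers: int,
                       hidden: int, bottleneck: int) -> float:
    """Share of all parameters that is trained when the backbone stays frozen."""
    trainable = num_layers * adapter_params(hidden, bottleneck)
    return trainable / (backbone_params + trainable)


if __name__ == "__main__":
    # Hypothetical sizes chosen only to illustrate the arithmetic.
    frac = trainable_fraction(backbone_params=355_000_000, num_layers=24,
                              hidden=1024, bottleneck=128)
    print(f"trainable share: {frac:.2%}")
```

Because the adapter cost grows with `hidden * bottleneck` per layer while the frozen backbone is untouched, the trainable share stays small even for large encoders.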

Findings

01. LiST improves performance by 35% over classic fine-tuning.

02. LiST reduces trainable parameters by 96%.

03. LiST outperforms GPT-3 in few-shot NLU tasks by 33%.

Abstract

We present LiST (short for Lite Prompted Self-Training), a new method for parameter-efficient fine-tuning of large pre-trained language models (PLMs) for few-shot learning. LiST improves over recent methods that adopt prompt-based fine-tuning (FN) using two key techniques. The first is the use of self-training to leverage large amounts of unlabeled data for prompt-based FN in few-shot settings. We use self-training in conjunction with meta-learning for re-weighting noisy pseudo-prompt labels. Self-training is expensive as it requires updating all the model parameters repeatedly. Therefore, we use a second technique for lightweight fine-tuning, where we introduce a small number of task-specific parameters that are fine-tuned during self-training while keeping the PLM encoder frozen. Our experiments show that LiST can effectively leverage unlabeled data to improve the model performance…
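
The self-training step described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' implementation: the abstract says pseudo-labels are re-weighted with meta-learning, whereas this sketch swaps in a much simpler stand-in (the teacher's confidence) as the weight. The names `pseudo_label` and `weighted_student_loss` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label(teacher_logits):
    """Return (label, weight) for one unlabeled example.

    The weight is the teacher's confidence in its own prediction.
    ASSUMPTION: this heuristic stands in for the meta-learned
    re-weighting mentioned in the abstract.
    """
    probs = softmax(teacher_logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


def weighted_student_loss(student_logits, labels, weights):
    """Weighted mean cross-entropy of the student against pseudo-labels."""
    total = 0.0
    for logits, y, w in zip(student_logits, labels, weights):
        total += -w * math.log(softmax(logits)[y])
    return total / sum(weights)


if __name__ == "__main__":
    # Toy teacher outputs for three unlabeled examples (made-up numbers).
    teacher = [[2.0, 0.0], [0.1, 0.0], [-1.0, 3.0]]
    labels, weights = zip(*(pseudo_label(t) for t in teacher))
    # Toy student outputs; in practice only the small task-specific
    # parameters would be updated to reduce this loss.
    student = [[0.5, 0.0], [0.0, 0.2], [0.0, 1.0]]
    print(labels, weighted_student_loss(student, labels, weights))
```

Low-confidence pseudo-labels contribute less to the loss, which is the general motivation for re-weighting noisy pseudo-labels during self-training.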

Code & Models

Repositories

microsoft/list · PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: Linear Layer · Cosine Annealing · Dropout · Softmax · Layer Normalization · Dense Connections · Adam · Linear Warmup With Cosine Annealing