Improving In-Context Few-Shot Learning via Self-Supervised Training

Mingda Chen; Jingfei Du; Ramakanth Pasunuru; Todor Mihaylov; Srini Iyer; Veselin Stoyanov; Zornitsa Kozareva

arXiv:2205.01703·cs.CL·June 8, 2022·1 cites

Open Access

TL;DR

This paper introduces an intermediate self-supervised training stage to improve in-context few-shot learning in NLP, demonstrating that it enhances model performance and task adherence.

Contribution

It proposes and evaluates four self-supervised objectives during intermediate training to boost few-shot learning capabilities in NLP models.

Findings

01 Intermediate self-supervision outperforms strong baselines.

02 The amount of training data and the diversity of the self-supervised objectives both affect downstream performance.

03 Self-supervision complements human-annotated cross-task supervision.

Abstract

Self-supervised pretraining has made few-shot learning possible for many NLP tasks, but the pretraining objectives are not typically adapted specifically for in-context few-shot learning. In this paper, we propose to use self-supervision in an intermediate training stage between pretraining and downstream few-shot usage, with the goal of teaching the model to perform in-context few-shot learning. We propose and evaluate four self-supervised objectives on two benchmarks. We find that the intermediate self-supervision stage produces models that outperform strong baselines. An ablation study shows that several factors affect downstream performance, such as the amount of training data and the diversity of the self-supervised objectives. Human-annotated cross-task supervision and self-supervision are complementary. Qualitative analysis suggests that the self-supervised-trained models are better…
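To make the idea of an intermediate self-supervised stage more concrete, here is a minimal sketch of how unlabeled text could be turned into an in-context few-shot training instance. It is an illustration under stated assumptions, not the paper's implementation: the next-sentence-style objective, the `Input:`/`Output:` template, the naive sentence splitter, and the function names `split_sentences`, `sentence_pairs`, and `build_instance` are all hypothetical choices made for this example.

```python
import re
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_pairs(text: str) -> List[Tuple[str, str]]:
    """Build non-overlapping (input, output) pairs from raw text.

    Each pair is (sentence, following sentence), so the supervision
    comes from the text itself and needs no human labels.
    """
    sents = split_sentences(text)
    return list(zip(sents[0::2], sents[1::2]))


def build_instance(pairs: List[Tuple[str, str]], k: int) -> Tuple[str, str]:
    """Use the first k pairs as demonstrations and pair k as the query.

    Returns (prompt, target). The template is a hypothetical choice.
    """
    if len(pairs) < k + 1:
        raise ValueError("need at least k + 1 pairs")
    demos = pairs[:k]
    query_in, query_out = pairs[k]
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    blocks.append(f"Input: {query_in}\nOutput:")
    return "\n\n".join(blocks), query_out
```

A model trained on such (prompt, target) pairs would see the same demonstrations-then-query layout it later meets in downstream few-shot use, which is the intuition the abstract describes. A real pipeline would need a proper sentence segmenter and would draw demonstrations from varied sources.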

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications