Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal; Rishikesh Jha; Tsendsuren Munkhdalai; Andrew McCallum

arXiv:2009.08445 · cs.CL · November 17, 2020

TL;DR

This paper introduces a self-supervised meta-learning approach that generates diverse NLP tasks from unlabeled text to improve few-shot learning, outperforming traditional pre-training and fine-tuning methods.

Contribution

It proposes a novel self-supervised task generation method for meta-learning in NLP, reducing reliance on labeled tasks and enhancing few-shot generalization capabilities.

Findings

1. Meta-training on the generated self-supervised tasks improves few-shot performance.

2. Combining the self-supervised tasks with supervised tasks yields higher accuracy.

3. The approach outperforms standard pre-training followed by fine-tuning on 17 NLP tasks.

Abstract

Self-supervised pre-training of transformer models has revolutionized NLP applications. Such pre-training with language modeling objectives provides a useful initial point for parameters that generalize well to new tasks with fine-tuning. However, fine-tuning is still data inefficient -- when there are few labeled examples, accuracy can be low. Data efficiency can be improved by optimizing pre-training directly for future fine-tuning with few examples; this can be treated as a meta-learning problem. However, standard meta-learning techniques require many training tasks in order to generalize; unfortunately, finding a diverse set of such supervised tasks is usually difficult. This paper proposes a self-supervised approach to generate a large, rich, meta-learning task distribution from unlabeled text. This is achieved using a cloze-style objective, but creating separate multi-class…
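
The abstract is cut off before it finishes describing how tasks are built, so the following is only a minimal sketch of one plausible reading of "cloze-style" multi-class task generation, not the authors' implementation. It assumes whitespace tokenization, a hypothetical `[MASK]` token, and a hypothetical helper named `make_cloze_task`: a few words are sampled as classes, sentences containing each word are collected, the word is blanked out, and the classifier's job is to recover which word was removed.

```python
import random
from collections import defaultdict

MASK = "[MASK]"  # hypothetical mask token, chosen for illustration


def make_cloze_task(sentences, num_classes=3, shots=2, seed=0):
    """Build one N-way, k-shot classification task from unlabeled sentences.

    Assumption: each class is a word, and each example is a sentence with
    every occurrence of that word replaced by MASK.
    """
    rng = random.Random(seed)

    # Index tokenized sentences by the distinct words they contain.
    by_word = defaultdict(list)
    for sent in sentences:
        tokens = sent.split()
        for word in set(tokens):
            by_word[word].append(tokens)

    # Only words that occur in at least `shots` sentences can form a class.
    eligible = sorted(w for w, s in by_word.items() if len(s) >= shots)
    if len(eligible) < num_classes:
        raise ValueError("not enough frequent words to build a task")

    targets = rng.sample(eligible, num_classes)
    examples = []
    for label, word in enumerate(targets):
        for tokens in rng.sample(by_word[word], shots):
            masked = " ".join(MASK if t == word else t for t in tokens)
            examples.append((masked, label))

    rng.shuffle(examples)
    return targets, examples


if __name__ == "__main__":
    corpus = [
        "the cat sat on the mat",
        "a cat chased the mouse",
        "the dog barked at the cat",
        "a dog ran home",
        "my dog likes the park",
        "the mouse ate cheese",
        "a mouse hid in the park",
    ]
    words, task = make_cloze_task(corpus, num_classes=3, shots=2, seed=1)
    print("classes:", words)
    for text, label in task:
        print(label, text)
```

Calling the helper repeatedly with different seeds yields many distinct small classification tasks from the same unlabeled corpus, which is the kind of task distribution a meta-learner needs. A real pipeline would likely use a subword tokenizer and filter out very common words; both choices are left out here.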

Code & Models

Repositories

iesl/metanlp (TensorFlow, official)
