TL;DR
This paper introduces a self-supervised meta-learning approach that generates diverse NLP tasks from unlabeled text to improve few-shot learning, outperforming traditional pre-training and fine-tuning methods.
Contribution
It proposes a novel self-supervised task generation method for meta-learning in NLP, reducing reliance on labeled tasks and enhancing few-shot generalization capabilities.
Findings
Meta-training on generated tasks improves few-shot performance.
Combining self-supervised and supervised tasks yields higher accuracy.
The approach outperforms standard pre-training followed by fine-tuning on 17 NLP tasks.
Abstract
Self-supervised pre-training of transformer models has revolutionized NLP applications. Such pre-training with language modeling objectives provides a useful initial point for parameters that generalize well to new tasks with fine-tuning. However, fine-tuning is still data inefficient -- when there are few labeled examples, accuracy can be low. Data efficiency can be improved by optimizing pre-training directly for future fine-tuning with few examples; this can be treated as a meta-learning problem. However, standard meta-learning techniques require many training tasks in order to generalize; unfortunately, finding a diverse set of such supervised tasks is usually difficult. This paper proposes a self-supervised approach to generate a large, rich, meta-learning task distribution from unlabeled text. This is achieved using a cloze-style objective, but creating separate multi-class…
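The abstract is cut off before it describes how the tasks are built, so the following is only a minimal sketch of the general cloze-style idea, not the authors' implementation. It assumes that a task is formed by picking a few words, masking each one in sentences that contain it, and using the identity of the masked word as the class label. The names `build_word_index`, `make_cloze_task`, and the `[MASK]` placeholder are hypothetical and chosen for illustration.

```python
import random
import re
from collections import defaultdict

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the paper


def build_word_index(sentences):
    """Map each lowercase word to the indices of the sentences containing it."""
    index = defaultdict(list)
    for i, sent in enumerate(sentences):
        for word in set(re.findall(r"[a-z]+", sent.lower())):
            index[word].append(i)
    return index


def make_cloze_task(sentences, num_classes, examples_per_class, rng):
    """Sample one N-way classification task from unlabeled sentences.

    Assumed recipe: choose `num_classes` words, mask each word in
    `examples_per_class` sentences that contain it, and label every
    masked sentence with the index of the word that was hidden.
    """
    index = build_word_index(sentences)
    # Only words that occur in enough sentences can serve as a class.
    eligible = sorted(w for w, ids in index.items() if len(ids) >= examples_per_class)
    if len(eligible) < num_classes:
        raise ValueError("not enough frequent words for the requested task size")

    words = rng.sample(eligible, num_classes)
    task = []
    for label, word in enumerate(words):
        for i in rng.sample(index[word], examples_per_class):
            # Hide every whole-word occurrence of the target word.
            pattern = rf"\b{re.escape(word)}\b"
            masked = re.sub(pattern, MASK, sentences[i], flags=re.IGNORECASE)
            task.append((masked, label))
    rng.shuffle(task)
    return words, task
```

Calling `make_cloze_task(corpus, num_classes=3, examples_per_class=2, rng=random.Random(0))` on a list of raw sentences returns the chosen words and a shuffled list of `(masked_sentence, label)` pairs. Sampling repeatedly with different word subsets gives many distinct tasks from the same unlabeled text, which is the property meta-learning needs.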
