Learning to Few-Shot Learn Across Diverse Natural Language Classification Tasks

Trapit Bansal; Rishikesh Jha; Andrew McCallum

arXiv:1911.03863 · cs.CL · November 17, 2020

TL;DR

This paper introduces LEOPARD, a meta-learning method that enables transformer models to generalize to diverse NLP classification tasks with minimal labeled data, significantly improving few-shot learning performance.

Contribution

LEOPARD is a novel optimization-based meta-learning approach that handles tasks with varying numbers of classes, improving few-shot NLP classification across diverse domains.

Findings

01

LEOPARD outperforms baselines on 17 NLP tasks.

02

Achieves a 14.5% relative accuracy gain with 4 examples per label.

03

Generalizes well to unseen tasks with minimal data.
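
A relative gain such as the 14.5% in finding 02 is measured against the baseline's accuracy, not in absolute accuracy points. The short sketch below shows the arithmetic; the function name `relative_gain_percent` and the two accuracy values are hypothetical, chosen only so the result lands near 14.5%, and are not figures from the paper.

```python
def relative_gain_percent(baseline_acc, new_acc):
    """Relative improvement of new_acc over baseline_acc, in percent."""
    if baseline_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    return 100.0 * (new_acc - baseline_acc) / baseline_acc


# Hypothetical accuracies (in percent), used only to illustrate the arithmetic.
baseline, improved = 40.0, 45.8

gain = relative_gain_percent(baseline, improved)  # relative gain, in percent
absolute = improved - baseline                    # absolute gain, in accuracy points

print(f"relative gain: {gain:.1f}%, absolute gain: {absolute:.1f} points")
```

With these made-up numbers the same improvement reads as roughly 14.5% relative but only about 5.8 points absolute, which is why the two should not be compared directly.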

Abstract

Self-supervised pre-training of transformer models has shown enormous success in improving performance on a number of downstream tasks. However, fine-tuning on a new task still requires large amounts of task-specific labelled data to achieve good performance. We consider this problem of learning to generalize to new tasks with few examples as a meta-learning problem. While meta-learning has shown tremendous progress in recent years, its application is still limited to simulated problems or problems with limited diversity across tasks. We develop a novel method, LEOPARD, which enables optimization-based meta-learning across tasks with different numbers of classes, and evaluate different methods on generalization to diverse NLP classification tasks. LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization to tasks not seen at all during…
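
The abstract's central technical point is that a single meta-learned model must cope with tasks that have different numbers of classes, which a fixed-size softmax layer cannot do. One simple way to see how this can work is to build the per-class classifier weights from the few labelled examples of each task. The sketch below is an illustration under stated assumptions, not LEOPARD's implementation: it averages toy embeddings per label (a prototype-style simplification), the names `class_prototypes` and `predict_proba` are hypothetical, the 2-D vectors stand in for encoder outputs, and no gradient-based adaptation is performed.

```python
import math


def class_prototypes(support):
    """Average the embeddings of each label's support examples.

    `support` is a list of (embedding, label) pairs; the result has one
    weight vector per label, so its size follows the task's class count.
    """
    by_label = {}
    for embedding, label in support:
        by_label.setdefault(label, []).append(embedding)
    return {
        label: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def predict_proba(prototypes, embedding):
    """Softmax over dot-product scores against each class prototype."""
    scores = {
        label: sum(w * x for w, x in zip(proto, embedding))
        for label, proto in prototypes.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy 2-D "embeddings" (made up); a real system would use encoder outputs.
two_way = [
    ([1.0, 0.0], "pos"), ([0.9, 0.1], "pos"),
    ([0.0, 1.0], "neg"), ([0.1, 0.9], "neg"),
]
three_way = two_way + [([-1.0, -1.0], "neutral")]

query = [0.8, 0.2]
probs2 = predict_proba(class_prototypes(two_way), query)
probs3 = predict_proba(class_prototypes(three_way), query)
print(probs2)
print(probs3)
```

The same two functions serve both the 2-class and the 3-class task because the output layer is derived from the support set rather than fixed in advance.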

Taxonomy

Methods

Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax