Meta-training with Demonstration Retrieval for Efficient Few-shot Learning

Aaron Mueller; Kanika Narang; Lambert Mathias; Qifan Wang; Hamed Firooz

arXiv:2307.00119·cs.CL·July 4, 2023

Open Access

TL;DR

This paper introduces a novel meta-training approach that combines demonstration retrieval with parameter-efficient models, significantly improving few-shot learning across multiple NLP tasks while reducing computational requirements.

Contribution

The paper is the first to integrate retrieval with meta-training: it uses a dense passage retriever (DPR) to retrieve demonstrations from many tasks, which improves generalization and efficiency in few-shot NLP learning.

Findings

01 Outperforms existing parameter-efficient and retrieval-augmented methods on QA, NLI, and text classification tasks.

02 Enables quick meta-training and fine-tuning on a single GPU.

03 Demonstration retrieval improves model generalization across diverse tasks.

Abstract

Large language models show impressive results on few-shot NLP tasks. However, these models are memory- and computation-intensive. Meta-training allows one to leverage smaller models for few-shot generalization in a domain-general and task-agnostic manner; however, these methods alone result in models that may not have sufficient parameterization or knowledge to adapt quickly to a large variety of tasks. To overcome this issue, we propose meta-training with demonstration retrieval, where we use a dense passage retriever to retrieve semantically similar labeled demonstrations to each example for more varied supervision. By separating external knowledge from model parameters, we can use meta-training to train parameter-efficient models that generalize well on a larger variety of tasks. We construct a meta-training set from UnifiedQA and CrossFit, and propose a demonstration bank based on…
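
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the encoder below is a toy bag-of-words stand-in for a dense passage retriever, and the demonstration bank, the function names (embed, score, retrieve, build_input), and the "question: ... answer: ..." input layout are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy demonstration bank of (input text, label) pairs.
DEMO_BANK = [
    ("What is the capital of France?", "Paris"),
    ("Who wrote Hamlet?", "William Shakespeare"),
    ("What is the capital of Italy?", "Rome"),
    ("The movie was wonderful and moving.", "positive"),
]


def embed(text):
    """Stand-in encoder: L2-normalized bag-of-words counts.

    A real system would use a trained dense encoder here (assumption).
    """
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {tok: c / norm for tok, c in counts.items()}


def score(query_vec, demo_vec):
    """Inner product between two sparse vectors stored as dicts."""
    return sum(w * demo_vec.get(tok, 0.0) for tok, w in query_vec.items())


def retrieve(query, bank, k=2):
    """Return the k labeled demonstrations most similar to the query."""
    q = embed(query)
    ranked = sorted(bank, key=lambda demo: score(q, embed(demo[0])), reverse=True)
    return ranked[:k]


def build_input(query, demos):
    """Concatenate retrieved demonstrations with the query (assumed layout)."""
    parts = [f"question: {x} answer: {y}" for x, y in demos]
    parts.append(f"question: {query} answer:")
    return "\n".join(parts)


if __name__ == "__main__":
    query = "What is the capital of Spain?"
    demos = retrieve(query, DEMO_BANK, k=2)
    print(build_input(query, demos))
```

In this sketch the labeled demonstrations live outside the model, so the bank can be swapped or enlarged without retraining, which mirrors the abstract's point about separating external knowledge from model parameters.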

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications