Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation

Zejiang Hou; Julian Salazar; George Polovets

arXiv:2207.03509·cs.CL·July 11, 2022

TL;DR

This paper introduces a meta-learning approach that prepares large language models for efficient domain- and task-specific adaptation, reducing the time and data that adaptation requires.

Contribution

It proposes a dynamic low-rank reparameterization and a learned architecture controller that together capture the difference between general and adapted models, in terms of model weights and sublayer structure, to make adaptation more efficient.

Findings

01 Improves adaptation time and performance in few-shot and low-resource tasks

02 Outperforms direct finetuning and domain-adaptive pretraining

03 Task-adaptive reparameterization and model search components are effective

Abstract

Large pretrained language models (PLMs) are often domain- or task-adapted via finetuning or prompting. Finetuning requires modifying all of the parameters and having enough data to avoid overfitting, while prompting requires no training and few examples but limits performance. Instead, we prepare PLMs for data- and parameter-efficient adaptation by learning to learn the difference between general and adapted PLMs. This difference is expressed in terms of model weights and sublayer structure through our proposed dynamic low-rank reparameterization and learned architecture controller. Experiments on few-shot dialogue completion, low-resource abstractive summarization, and multi-domain language modeling show improvements in adaptation time and performance over direct finetuning or preparation via domain-adaptive pretraining. Ablations show our task-adaptive reparameterization (TARP) and…
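
To make the idea of a weight "difference" concrete, here is a minimal sketch of a generic low-rank reparameterization: the adapted weight is the general weight plus a product of two thin matrices. This is an illustration under stated assumptions, not the paper's TARP formulation. It is static, so it omits the dynamic component and the architecture controller, and every name in it (adapted_weight, w0, u, v, rank) is hypothetical.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Elementwise sum of two same-shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def adapted_weight(w0, u, v):
    """Return w0 + u @ v, with u of shape (d_out, r) and v of shape (r, d_in).

    Assumption: a plain additive low-rank difference; the paper's dynamic
    reparameterization is not reproduced here.
    """
    return add(w0, matmul(u, v))

random.seed(0)
d_out, d_in, rank = 6, 8, 2

# General (pretrained) weight, kept frozen during adaptation.
w0 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Low-rank factors; u starts at zero so the adapted model equals the general one.
u = [[0.0] * rank for _ in range(d_out)]
v = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(rank)]

w = adapted_weight(w0, u, v)

full_params = d_out * d_in            # parameters touched by full finetuning
delta_params = rank * (d_out + d_in)  # parameters in the low-rank difference
print(full_params, delta_params)
```

With these toy sizes the difference has 28 trainable values against 48 for the full matrix; the gap widens quickly as the layer dimensions grow while the rank stays small.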

Code & Models

Repositories

amazon-research/meta-learning-the-difference
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning