Learning to Generate Task-Specific Adapters from Task Description

Qinyuan Ye; Xiang Ren

arXiv:2101.00420 · cs.CL · June 16, 2021



TL;DR

This paper introduces Hypter, a hypernetwork-based framework that generates task-specific adapters from task descriptions, improving how well text-to-text transformers generalize to unseen tasks. With BART-Large, it reports an 11.3% improvement on the ZEST dataset.

Contribution

Hypter is a novel framework that trains a hypernetwork to produce adapters from task descriptions, improving task generalization over traditional fine-tuning methods.

Findings

01

11.3% comparative improvement on the ZEST dataset with BART-Large as the main network

02

Outperforms fine-tuning baselines on ZEST and a synthetic version of SQuAD

03

Enhances generalization to unseen tasks in NLP

Abstract

Pre-trained text-to-text transformers such as BART have achieved impressive performance across a range of NLP tasks. A recent study further shows that they can learn to generalize to novel tasks when task descriptions are included in the source sequence and the model is trained on (source, target) examples. At test time, these fine-tuned models can make inferences on new tasks by taking the new task descriptions as part of the input. However, this approach has potential limitations: the model learns to solve individual (source, target) examples (i.e., at the instance level) instead of learning to solve tasks by taking all examples within a task as a whole (i.e., at the task level). To this end, we introduce Hypter, a framework that improves text-to-text transformers' generalization ability to unseen tasks by training a hypernetwork to generate task-specific, light-weight adapters from…
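The core idea in the abstract, a hypernetwork that turns a task description into the weights of a small adapter, can be illustrated with a toy sketch. This is not the authors' implementation (see the official repository for that). The class name `HyperNetwork`, the function `apply_adapter`, the single linear hypernetwork layer, the bottleneck adapter with a ReLU and residual connection, and the hand-written description vectors are all assumptions made for illustration. In practice the description would be encoded by the text-to-text model rather than supplied as a fixed vector.

```python
import random


def matvec(matrix, vector):
    """Multiply a row-major list-of-lists matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class HyperNetwork:
    """Hypothetical hypernetwork: description embedding -> adapter weights."""

    def __init__(self, desc_dim, hidden_dim, bottleneck, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.bottleneck = bottleneck
        # One output per adapter weight: a down- and an up-projection.
        n_params = 2 * hidden_dim * bottleneck
        self.weights = [
            [rng.gauss(0.0, 0.1) for _ in range(desc_dim)]
            for _ in range(n_params)
        ]

    def generate(self, desc_embedding):
        """Return (down, up) adapter matrices for one task description."""
        flat = matvec(self.weights, desc_embedding)
        d, r = self.hidden_dim, self.bottleneck
        split = d * r
        down = [flat[i * d:(i + 1) * d] for i in range(r)]  # r x d
        up = [flat[split + i * r:split + (i + 1) * r] for i in range(d)]  # d x r
        return down, up


def apply_adapter(hidden, down, up):
    """Bottleneck adapter with a residual connection (assumed design)."""
    z = [max(0.0, v) for v in matvec(down, hidden)]  # ReLU in the bottleneck
    delta = matvec(up, z)
    return [h + dv for h, dv in zip(hidden, delta)]


# Toy usage: two made-up task descriptions yield two different adapters.
hyper = HyperNetwork(desc_dim=4, hidden_dim=6, bottleneck=2)
task_a = [1.0, 0.0, 0.5, -0.5]
task_b = [0.0, 1.0, -0.5, 0.5]
hidden = [0.2, -0.1, 0.4, 0.0, 0.3, -0.2]

down_a, up_a = hyper.generate(task_a)
down_b, up_b = hyper.generate(task_b)
out_a = apply_adapter(hidden, down_a, up_a)
out_b = apply_adapter(hidden, down_b, up_b)
```

In this sketch only the hypernetwork holds trainable parameters. The adapter weights are an output computed per task, so an unseen task needs nothing more than its description to obtain its own adapter.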


Code & Models

Repositories

INK-USC/hypter
PyTorch · Official


Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare

Methods: Linear Layer · Dense Connections · Softmax · Dropout · Byte Pair Encoding · Attention Is All You Need · Adam · Layer Normalization · Multi-Head Attention