ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale

Markus Frohmann; Carolin Holtermann; Shahed Masoudian; Anne Lauscher; Navid Rekabsaz

arXiv:2310.01217 · cs.LG · May 20, 2024

TL;DR

ScaLearn introduces a simple, parameter-efficient method for multi-task transfer learning using linear scaling of source adapters, outperforming existing methods with minimal additional parameters across multiple benchmarks.

Contribution

The paper proposes ScaLearn, a novel two-stage multi-task learning approach that uses minimal scaling parameters for effective transfer, significantly reducing parameter overhead compared to prior methods.

Findings

01. Outperforms strong baselines on the GLUE, SuperGLUE, and HumSet benchmarks.

02. Uses only about 0.35% of the transfer parameters that AdapterFusion requires.

03. Remains competitive with as few as 8 transfer parameters per target task.

Abstract

Multi-task learning (MTL) has shown considerable practical benefits, particularly when using language models (LMs). While this is commonly achieved by learning $n$ tasks under a joint optimization procedure, some methods, such as AdapterFusion, divide the problem into two stages: (i) task learning, where knowledge specific to a task is encapsulated within sets of parameters (e.g., adapters), and (ii) transfer, where this already learned knowledge is leveraged for a target task. This separation of concerns provides numerous benefits (e.g., promoting reusability). However, current two-stage MTL introduces a substantial number of additional parameters. We address this issue by leveraging the usefulness of linearly scaling the output representations of source adapters for transfer learning. We introduce ScaLearn, a simple and highly parameter-efficient two-stage MTL method that capitalizes…
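The core idea in the abstract, linearly scaling the output representations of source adapters and combining them for a target task, can be illustrated with a minimal sketch. The function name `scale_and_combine`, the plain-list representation, and the example numbers are assumptions made for illustration; they are not taken from the paper or its repository, and the paper's actual formulation may differ in detail.

```python
from typing import List


def scale_and_combine(source_outputs: List[List[float]],
                      scales: List[float]) -> List[float]:
    """Combine source adapter outputs as a scaled sum.

    source_outputs: one output vector per source adapter (hypothetical values).
    scales: one learned scalar per source adapter (assumed form of the
            transfer parameters; an illustration, not the paper's exact setup).
    """
    if len(source_outputs) != len(scales):
        raise ValueError("need exactly one scale per source adapter")

    hidden_dim = len(source_outputs[0])
    combined = [0.0] * hidden_dim
    for output, scale in zip(source_outputs, scales):
        if len(output) != hidden_dim:
            raise ValueError("all source outputs must share one dimension")
        for i, value in enumerate(output):
            # Each source contributes its output, linearly scaled.
            combined[i] += scale * value
    return combined


# Two hypothetical source adapters with 2-dimensional outputs.
outputs = [[1.0, 2.0], [3.0, 4.0]]
scales = [0.5, 0.25]
print(scale_and_combine(outputs, scales))  # [1.25, 2.0]
```

In this sketch the only transfer parameters are the entries of `scales`, one per source adapter, so the number of values to learn for a target task grows with the number of source adapters rather than with the size of the adapters themselves.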

Code & Models

Repositories

cpjku/scalearn · jax · Official

Videos

ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale · Underline

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications

Methods: FLIP