Learning to Route for Dynamic Adapter Composition in Continual Learning with Language Models

Vladimir Araujo; Marie-Francine Moens; Tinne Tuytelaars

arXiv:2408.09053 · cs.LG · October 31, 2024

TL;DR

This paper introduces L2R, a novel routing method for continual learning with language models that isolates task-specific modules and learns to compose them effectively, improving performance and generalization.

Contribution

L2R is the first approach to isolate PEFT module training and learn to compose modules via a router network using a small memory of past tasks.

Findings

01 L2R outperforms existing methods in continual learning benchmarks.

02 Isolating module training reduces interference and improves task specialization.

03 Learned routing enhances module composition and overall model performance.

Abstract

Parameter-efficient fine-tuning (PEFT) methods are increasingly used with pre-trained language models (PLMs) for continual learning (CL). These methods typically involve training a PEFT module for each new task and employing similarity-based selection to route modules during inference. However, they face two major limitations: 1) interference during module training with already learned modules and 2) suboptimal routing when composing modules. In this paper, we present L2R, a method that isolates the training of new PEFT modules to ensure their task specialization. L2R then learns to compose the learned modules by training a network of routers that leverages a small memory containing examples of previously seen tasks. We evaluate our method in two CL setups using various benchmarks. Our results demonstrate that L2R provides an effective composition of PEFT modules, leading to improved…
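The two stages described in the abstract (modules trained in isolation, then a router fitted on a small memory) can be illustrated with a toy sketch. Everything below is hypothetical: the names `route`, `compose`, and `train_router` are not from the paper, the router is a single linear layer with a softmax rather than a network of routers, and it is supervised with task ids from the memory, which is a simplification chosen for illustration and may differ from the objective L2R actually uses.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_weights, x):
    """Mixing weights over modules: softmax of one linear score per module."""
    scores = [sum(w_d * x_d for w_d, x_d in zip(w, x)) for w in router_weights]
    return softmax(scores)


def compose(module_outputs, mix):
    """Weighted sum of the (frozen) modules' output vectors."""
    dim = len(module_outputs[0])
    return [sum(p * out[d] for p, out in zip(mix, module_outputs))
            for d in range(dim)]


def train_router(memory, num_modules, dim, lr=0.5, epochs=200, seed=0):
    """Fit router weights on (input, task_id) pairs stored in the memory.

    Assumption: cross-entropy against the task id, optimized with plain SGD.
    The modules themselves are never updated here, mirroring the idea that
    module training and router training are separate steps.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(dim)]
               for _ in range(num_modules)]
    for _ in range(epochs):
        for x, task_id in memory:
            probs = route(weights, x)
            for k in range(num_modules):
                # Gradient of softmax cross-entropy w.r.t. the score of module k.
                grad = probs[k] - (1.0 if k == task_id else 0.0)
                for d in range(dim):
                    weights[k][d] -= lr * grad * x[d]
    return weights


# Toy memory: a few stored examples per previously seen task (made-up data).
memory = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.1, 0.9], 1)]
router = train_router(memory, num_modules=2, dim=2)

# At inference, the router mixes the outputs of the two frozen modules.
x = [1.0, 0.0]
mix = route(router, x)
module_outputs = [[1.0, 0.0], [0.0, 1.0]]  # stand-ins for PEFT module outputs
combined = compose(module_outputs, mix)
```

In this toy setup the learned weights replace a fixed similarity rule: inputs that resemble a task's stored examples receive a larger share of that task's module, while inputs between tasks receive a blend of both.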

Taxonomy

Topics: Speech Recognition and Synthesis