Towards Robust and Efficient Continual Language Learning

Adam Fisch; Amal Rannen-Triki; Razvan Pascanu; Jörg Bornschein; Angeliki Lazaridou; Elena Gribovskaya; Marc'Aurelio Ranzato

arXiv:2307.05741·cs.CL·July 13, 2023·1 cites

Open Access

TL;DR

This paper introduces a new benchmark for continual language learning that evaluates a model's ability to achieve positive transfer while avoiding negative transfer, and it proposes a selective initialization strategy for improved adaptation.

Contribution

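The paper presents a benchmark of task sequences covering different transfer scenarios (high potential for positive transfer, high potential for negative transfer, no expected effect, and mixtures of these) and proposes a simple, effective continual learning method based on selective checkpoint initialization.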
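As a rough illustration of what selective checkpoint initialization could look like, the sketch below picks a starting checkpoint for a new task from a pool of earlier ones. It is not the paper's procedure: the function name `select_initial_checkpoint`, the `proxy_score` callback (some cheap estimate of how well a checkpoint suits the new task), and the `margin` parameter are all hypothetical, introduced here only for illustration. The one idea taken from the summary is that the learner should reuse a past checkpoint only when it is expected to help, and otherwise fall back to a default such as the pretrained model to avoid negative transfer.

```python
from typing import Callable, Mapping, TypeVar

Checkpoint = TypeVar("Checkpoint")


def select_initial_checkpoint(
    candidates: Mapping[str, Checkpoint],
    proxy_score: Callable[[Checkpoint], float],
    default_name: str,
    margin: float = 0.0,
) -> str:
    """Return the name of the checkpoint to start fine-tuning from.

    Hypothetical sketch: `proxy_score` is an assumed, cheap estimate of
    how well a checkpoint will transfer to the new task (higher is better).
    """
    if default_name not in candidates:
        raise KeyError(f"default checkpoint {default_name!r} not in candidates")

    # Score the fallback first; every other candidate must beat it.
    baseline = proxy_score(candidates[default_name])
    best_name, best_score = default_name, baseline

    for name, checkpoint in candidates.items():
        if name == default_name:
            continue
        score = proxy_score(checkpoint)
        if score > best_score:
            best_name, best_score = name, score

    # Reuse a past checkpoint only if it clears the baseline by `margin`;
    # otherwise keep the default to limit the risk of negative transfer.
    if best_score - baseline > margin:
        return best_name
    return default_name
```

In this sketch, a larger `margin` makes the selection more conservative: a past checkpoint is reused only when its estimated benefit over the default is clear.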

Findings

01. Benchmark covers diverse transfer scenarios.

02. Selective initialization improves transfer effectiveness.

03. Framework helps analyze positive and negative transfer in language models.

Abstract

As the application space of language models continues to evolve, a natural question to ask is how we can quickly adapt models to new tasks. We approach this classic question from a continual learning perspective, in which we aim to continue fine-tuning models trained on past tasks on new tasks, with the goal of "transferring" relevant knowledge. However, this strategy also runs the risk of doing more harm than good, i.e., negative transfer. In this paper, we construct a new benchmark of task sequences that target different possible transfer scenarios one might face, such as a sequence of tasks with high potential of positive transfer, high potential for negative transfer, no expected effect, or a mixture of each. An ideal learner should be able to maximally exploit information from all tasks that have any potential for positive transfer, while also avoiding the negative effects of any…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling