Overcoming Catastrophic Forgetting in Massively Multilingual Continual Learning

Genta Indra Winata; Lingjue Xie; Karthik Radhakrishnan; Shijie Wu; Xisen Jin; Pengxiang Cheng; Mayank Kulkarni; Daniel Preotiuc-Pietro

arXiv:2305.16252·cs.CL·May 26, 2023·1 cites

Open Access

TL;DR

This paper addresses catastrophic forgetting in multilingual continual learning by proposing LR ADJUST, a simple learning rate scheduling method that helps preserve past knowledge in a setting with up to 51 languages, covering both classification and sequence labeling tasks.

Contribution

The paper introduces LR ADJUST, a novel learning rate scheduling technique that mitigates catastrophic forgetting in massively multilingual continual learning settings.

Findings

01. LR ADJUST reduces catastrophic forgetting of previously seen languages.

02. The method is effective across multiple continual learning approaches.

03. The paper provides further insights into the dynamics of catastrophic forgetting in multilingual models.

Abstract

Real-life multilingual systems should be able to efficiently incorporate new languages as data distributions fed to the system evolve and shift over time. To do this, systems need to handle the issue of catastrophic forgetting, where the model performance drops for languages or tasks seen further in its past. In this paper, we study catastrophic forgetting, as well as methods to minimize this, in a massively multilingual continual learning framework involving up to 51 languages and covering both classification and sequence labeling tasks. We present LR ADJUST, a learning rate scheduling method that is simple, yet effective in preserving new information without strongly overwriting past knowledge. Furthermore, we show that this method is effective across multiple continual learning approaches. Finally, we provide further insights into the dynamics of catastrophic forgetting in this…
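The abstract names LR ADJUST as a learning rate scheduling method but does not spell out the schedule itself. Purely as an illustration of the general idea, the sketch below assumes that the learning rate is multiplied by a fixed decay factor each time a new language is introduced, and is never allowed to drop below a floor. The function name `lr_adjust_schedule` and the parameters `decay` and `min_lr` are hypothetical and are not taken from the paper.

```python
from typing import List


def lr_adjust_schedule(
    base_lr: float,
    num_tasks: int,
    decay: float = 0.9,    # assumed multiplicative factor, must be in (0, 1]
    min_lr: float = 1e-6,  # assumed lower bound on the learning rate
) -> List[float]:
    """Return one learning rate per task (e.g. per newly added language).

    Assumption, not the paper's stated method: each new task trains with a
    learning rate no larger than the previous one, so later updates perturb
    earlier knowledge less.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    if num_tasks < 0:
        raise ValueError("num_tasks must be non-negative")

    rates: List[float] = []
    lr = base_lr
    for _ in range(num_tasks):
        rates.append(lr)
        # Shrink the rate for the next task, but never go below the floor.
        lr = max(min_lr, lr * decay)
    return rates


if __name__ == "__main__":
    # Hypothetical sequence of five languages arriving one after another.
    for task_id, lr in enumerate(lr_adjust_schedule(1e-4, 5, decay=0.5, min_lr=2e-5)):
        print(f"task {task_id}: lr = {lr:.2e}")
```

In a training loop, the rate for task `t` would be set on the optimizer before fine-tuning on that task's data. The floor keeps the model able to learn late-arriving languages at all, which is the trade-off the abstract describes between acquiring new information and not overwriting past knowledge.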

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · COVID-19 diagnosis using AI