Continual Learning Under Language Shift
Evangelia Gogoulou, Timothée Lesort, Magnus Boman, Joakim Nivre

TL;DR
This paper studies continual learning for language models under language shift. It analyzes how incrementally adding new languages affects performance on both earlier and newly added languages (backward and forward transfer), and how language similarity and language contamination relate to these effects.
Contribution
It provides a detailed analysis of forward and backward transfer effects in continual language learning, highlighting the impact of language order and linguistic similarities.
Findings
Forward transfer is generally positive and order-independent.
Backward transfer varies with language order and characteristics.
Language contamination and syntactic similarity influence transfer effects.
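The forward and backward transfer effects summarized above can be made concrete with a small sketch. The definitions below are one common way to measure them and are an assumption here, not necessarily the exact metrics used in the paper: forward transfer compares a hypothetical from-scratch monolingual baseline with the continually trained model on the newly added language, and backward transfer compares the loss on an earlier language right after its own stage with the loss after all later stages. The function name, argument names, and all loss values are illustrative placeholders, not results from the paper.

```python
from typing import Dict, List


def transfer_effects(
    order: List[str],
    loss_after_stage: List[Dict[str, float]],
    scratch_loss: Dict[str, float],
) -> Dict[str, Dict[str, float]]:
    """Compute assumed forward/backward transfer scores from evaluation losses.

    order:            languages in the order they were added, e.g. ["en", "da", "is"]
    loss_after_stage: loss_after_stage[i][lang] is the loss on `lang` after
                      finishing training stage i (hypothetical measurements)
    scratch_loss:     loss of a model trained on `lang` alone (hypothetical baseline)

    Positive scores mean the continual setup helped (lower loss).
    """
    forward: Dict[str, float] = {}
    backward: Dict[str, float] = {}
    last = loss_after_stage[-1]
    for i, lang in enumerate(order):
        if i > 0:
            # Forward transfer: gain on the new language from earlier stages.
            forward[lang] = scratch_loss[lang] - loss_after_stage[i][lang]
        if i < len(order) - 1:
            # Backward transfer: change on an earlier language after later stages.
            backward[lang] = loss_after_stage[i][lang] - last[lang]
    return {"forward": forward, "backward": backward}


# Illustrative placeholder numbers only.
order = ["en", "da", "is"]
loss_after_stage = [
    {"en": 3.0},
    {"en": 3.2, "da": 2.8},
    {"en": 3.5, "da": 2.9, "is": 3.1},
]
scratch_loss = {"da": 3.0, "is": 3.6}
effects = transfer_effects(order, loss_after_stage, scratch_loss)
```

With these placeholder values, the forward scores are positive and the backward scores are negative, which corresponds to the case where earlier languages degrade after new ones are added. Swapping the language order and recomputing is how an order dependence like the one described in the findings would show up.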
Abstract
The recent increase in data and model scale for language model pre-training has led to huge training costs. In scenarios where new data become available over time, updating a model instead of fully retraining it would therefore provide significant gains. We study the pros and cons of updating a language model when new data comes from new languages -- the case of continual learning under language shift. Starting from a monolingual English language model, we incrementally add data from Danish, Icelandic, and Norwegian to investigate how forward and backward transfer effects depend on pre-training order and characteristics of languages, for three different model sizes. Our results show that, while forward transfer is largely positive and independent of language order, backward transfer can be positive or negative depending on the order and characteristics of new languages. We explore a…
Taxonomy
Topics
Second Language Learning and Teaching
