Match the Script, Adapt if Multilingual: Analyzing the Effect of Multilingual Pretraining on Cross-lingual Transferability
Yoshinari Fujinuma, Jordan Boyd-Graber, Katharina Kann

TL;DR
This paper investigates how the number and diversity of pretraining languages affect zero-shot cross-lingual transfer in multilingual models, highlighting the importance of model adaptation to leverage more languages effectively.
Contribution
It empirically examines how the number and relatedness of pretraining languages, and model adaptation prior to finetuning, affect zero-shot transfer to languages unseen during pretraining.
Findings
Diverse pretraining languages improve zero-shot performance.
Model adaptation amplifies benefits of using more pretraining languages.
Without adaptation, adding pretraining languages helps only up to the point where related languages are included; after that, performance plateaus.
Abstract
Pretrained multilingual models enable zero-shot learning even for unseen languages, and that performance can be further improved via adaptation prior to finetuning. However, it is unclear how the number of pretraining languages influences a model's zero-shot learning for languages unseen during pretraining. To fill this gap, we ask the following research questions: (1) How does the number of pretraining languages influence zero-shot performance on unseen target languages? (2) Does the answer to that question change with model adaptation? (3) Do the findings for our first question change if the languages used for pretraining are all related? Our experiments on pretraining with related languages indicate that choosing a diverse set of languages is crucial. Without model adaptation, surprisingly, increasing the number of pretraining languages yields better results up to adding related…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Interpreting and Communication in Healthcare
