Cross-Lingual Language Model Meta-Pretraining
Zewen Chi, Heyan Huang, Luyang Liu, Yu Bai, Xian-Ling Mao

TL;DR
This paper introduces a two-phase meta-pretraining approach for cross-lingual language models, enhancing both their generalization to downstream tasks and their transferability across languages by separating these learning objectives.
Contribution
The paper adds a meta-pretraining phase on a large-scale monolingual corpus before cross-lingual pretraining, so that generalization and cross-lingual transferability are learned in separate phases instead of being traded off within a single phase.
Findings
Improved cross-lingual transferability and generalization.
Better-aligned multilingual representations.
Enhanced performance on downstream tasks.
Abstract
The success of pretrained cross-lingual language models relies on two essential abilities, i.e., generalization ability for learning downstream tasks in a source language, and cross-lingual transferability for transferring the task knowledge to other languages. However, current methods jointly learn the two abilities in a single-phase cross-lingual pretraining process, resulting in a trade-off between generalization and cross-lingual transfer. In this paper, we propose cross-lingual language model meta-pretraining, which learns the two abilities in different training phases. Our method introduces an additional meta-pretraining phase before cross-lingual pretraining, where the model learns generalization ability on a large-scale monolingual corpus. Then, the model focuses on learning cross-lingual transfer on a multilingual corpus. Experimental results show that our method improves both…
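The two-phase schedule described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the "model" here is just a dictionary counting updates per language, and every name (`train_phase`, `count_step`, `two_phase_pretrain`, the `(language, text)` batch format) is a hypothetical stand-in chosen for illustration. The only point it shows is the ordering: phase 2 starts from the state produced by phase 1 rather than from scratch.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Batch = Tuple[str, str]   # (language code, text); hypothetical toy format
State = Dict[str, int]    # toy stand-in for model parameters: updates seen per language
StepFn = Callable[[State, Batch], State]


def train_phase(state: State, batches: Iterable[Batch], step_fn: StepFn) -> State:
    """Apply one update per batch and return the resulting state."""
    for batch in batches:
        state = step_fn(state, batch)
    return state


def count_step(state: State, batch: Batch) -> State:
    """Toy update rule (assumption): count one update for the batch's language."""
    lang, _text = batch
    new_state = dict(state)
    new_state[lang] = new_state.get(lang, 0) + 1
    return new_state


def two_phase_pretrain(
    monolingual: Iterable[Batch],
    multilingual: Iterable[Batch],
    step_fn: StepFn,
    init: Optional[State] = None,
) -> State:
    """Run meta-pretraining first, then cross-lingual pretraining."""
    state: State = dict(init or {})
    # Phase 1 (meta-pretraining): monolingual corpus only.
    state = train_phase(state, monolingual, step_fn)
    # Phase 2 (cross-lingual pretraining): continues from the phase 1 state.
    state = train_phase(state, multilingual, step_fn)
    return state


mono_corpus = [("en", "first sentence"), ("en", "second sentence")]
multi_corpus = [("en", "hello"), ("fr", "bonjour"), ("zh", "你好")]
final_state = two_phase_pretrain(mono_corpus, multi_corpus, count_step)
print(final_state)
```

In a real setting, `step_fn` would be a gradient update on a language-modeling loss and `State` would hold model parameters; the sketch leaves those details out because the summary above does not specify them.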
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
