Cross-Lingual Language Model Meta-Pretraining

Zewen Chi; Heyan Huang; Luyang Liu; Yu Bai; Xian-Ling Mao

arXiv:2109.11129 · cs.CL · September 24, 2021 · 1 citation

Open Access

TL;DR

This paper introduces a two-phase meta-pretraining approach for cross-lingual language models, enhancing both their generalization to downstream tasks and their transferability across languages by separating these learning objectives.

Contribution

The paper adds a meta-pretraining phase on a large-scale monolingual corpus before cross-lingual pretraining, so that generalization and cross-lingual transferability are learned in separate phases instead of being traded off within a single one.

Findings

01. Improved cross-lingual transferability and generalization.

02. Better-aligned multilingual representations.

03. Enhanced performance on downstream tasks.

Abstract

The success of pretrained cross-lingual language models relies on two essential abilities, i.e., generalization ability for learning downstream tasks in a source language, and cross-lingual transferability for transferring the task knowledge to other languages. However, current methods jointly learn the two abilities in a single-phase cross-lingual pretraining process, resulting in a trade-off between generalization and cross-lingual transfer. In this paper, we propose cross-lingual language model meta-pretraining, which learns the two abilities in different training phases. Our method introduces an additional meta-pretraining phase before cross-lingual pretraining, where the model learns generalization ability on a large-scale monolingual corpus. Then, the model focuses on learning cross-lingual transfer on a multilingual corpus. Experimental results show that our method improves both…
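The two-phase ordering described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the `Phase` class, the `two_phase_schedule` function, the toy corpora, and the step counts are all hypothetical and only illustrate that monolingual meta-pretraining finishes before cross-lingual pretraining begins. A real pipeline would update the same model throughout and draw batches from large corpora.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class Phase:
    name: str          # label for the training phase (hypothetical naming)
    corpus: List[str]  # toy stand-in for a text corpus
    steps: int         # number of update steps spent in this phase


def two_phase_schedule(phases: List[Phase]) -> Iterator[Tuple[int, str, str]]:
    """Yield (global_step, phase_name, example), finishing each phase before the next."""
    step = 0
    for phase in phases:
        for i in range(phase.steps):
            # Cycle through the toy corpus; a real pipeline would sample batches.
            yield step, phase.name, phase.corpus[i % len(phase.corpus)]
            step += 1


# Toy corpora and step counts, chosen only for illustration.
monolingual = ["the cat sat", "a dog ran"]
multilingual = ["the cat sat", "le chat dort", "der Hund läuft"]

schedule = list(two_phase_schedule([
    Phase("meta_pretraining", monolingual, steps=4),
    Phase("cross_lingual_pretraining", multilingual, steps=3),
]))

for step, phase_name, example in schedule:
    print(step, phase_name, example)
```

The global step counter keeps running across the phase boundary, which mirrors the idea that the second phase continues from the model produced by the first phase instead of starting over.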

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications