Cross-lingual Language Model Pretraining
Guillaume Lample, Alexis Conneau

TL;DR
This paper introduces cross-lingual pretraining methods for language models (XLMs) and reports state-of-the-art results on cross-lingual classification and on both unsupervised and supervised machine translation.
Contribution
It extends generative pretraining from English to multiple languages with two methods: an unsupervised one that relies only on monolingual data, and a supervised one that leverages parallel data through a new cross-lingual language model objective.
Findings
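Improved the state of the art on XNLI by an absolute gain of 4.9% accuracy
Reached 34.3 BLEU on WMT'16 German-English unsupervised machine translation, more than 9 BLEU above the previous state of the art
Set a new state of the art of 38.5 BLEU on supervised machine translation (Romanian-English)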
Abstract
Recent studies have demonstrated the efficiency of generative pretraining for English natural language understanding. In this work, we extend this approach to multiple languages and show the effectiveness of cross-lingual pretraining. We propose two methods to learn cross-lingual language models (XLMs): one unsupervised that only relies on monolingual data, and one supervised that leverages parallel data with a new cross-lingual language model objective. We obtain state-of-the-art results on cross-lingual classification, unsupervised and supervised machine translation. On XNLI, our approach pushes the state of the art by an absolute gain of 4.9% accuracy. On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT'16…
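The abstract names the supervised objective without describing how its inputs are built. As a rough illustration only, the sketch below assumes the translation language modeling (TLM) setup described in the full paper: a parallel sentence pair is concatenated, tokens are masked on both sides with BERT-style masking (15% of tokens; of those, 80% replaced by a mask symbol, 10% by a random token, 10% left unchanged), position indices restart at the target sentence, and every token carries a language tag. The function names, the `[MASK]` string, and the dictionary layout are hypothetical, and sentence delimiters and BPE segmentation are omitted for brevity.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol (name chosen for this sketch)


def mask_tokens(tokens, vocab, rng, mask_prob=0.15):
    """BERT-style masking. Returns (corrupted tokens, labels).

    labels[i] holds the original token where a prediction is required,
    and None everywhere else.
    """
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)
            r = rng.random()
            if r < 0.8:
                corrupted.append(MASK)               # replace with the mask symbol
            elif r < 0.9:
                corrupted.append(rng.choice(vocab))  # replace with a random token
            else:
                corrupted.append(tok)                # keep the token unchanged
        else:
            labels.append(None)
            corrupted.append(tok)
    return corrupted, labels


def build_tlm_example(src, tgt, src_lang, tgt_lang, vocab, rng, mask_prob=0.15):
    """Concatenate a parallel pair and mask tokens on both sides (assumed TLM setup)."""
    tokens, labels = mask_tokens(src + tgt, vocab, rng, mask_prob)
    # Position indices restart at 0 for the target sentence.
    positions = list(range(len(src))) + list(range(len(tgt)))
    # One language tag per token, later mapped to a language embedding.
    langs = [src_lang] * len(src) + [tgt_lang] * len(tgt)
    return {"tokens": tokens, "labels": labels, "positions": positions, "langs": langs}


rng = random.Random(0)
src = "the curtains were blue".split()
tgt = "les rideaux étaient bleus".split()
vocab = sorted(set(src + tgt))
example = build_tlm_example(src, tgt, "en", "fr", vocab, rng)
print(example)
```

Because masked positions can fall in either sentence, a model trained this way can use the translation as context when predicting a masked word, which is the intuition behind using parallel data for pretraining.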
Code & Models
- 🤗 FacebookAI/xlm-clm-ende-1024 (model) · 26 dl
- 🤗 FacebookAI/xlm-clm-enfr-1024 (model) · 346 dl
- 🤗 FacebookAI/xlm-mlm-100-1280 (model) · 86 dl · ♡ 4
- 🤗 FacebookAI/xlm-mlm-17-1280 (model) · 39 dl · ♡ 2
- 🤗 FacebookAI/xlm-mlm-en-2048 (model) · 131k dl · ♡ 1
- 🤗 FacebookAI/xlm-mlm-ende-1024 (model) · 2.9k dl · ♡ 1
- 🤗 FacebookAI/xlm-mlm-enfr-1024 (model) · 47 dl
- 🤗 FacebookAI/xlm-mlm-enro-1024 (model) · 17 dl
- 🤗 FacebookAI/xlm-mlm-tlm-xnli15-1024 (model) · 37 dl · ♡ 1
- 🤗 FacebookAI/xlm-mlm-xnli15-1024 (model) · 22 dl
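The checkpoint names above appear to follow a common pattern: `xlm-<objective(s)>-<languages>-<size>`. The sketch below splits a name into those parts. The pattern is inferred only from the ten names listed here, `parse_xlm_name` is a hypothetical helper, and reading the trailing number as the model's hidden size is an assumption rather than something this page states.

```python
import re

# Assumed naming pattern, inferred only from the checkpoint names listed above:
#   [<org>/]xlm-<objective>[-<objective>...]-<languages>-<size>
NAME_RE = re.compile(
    r"^(?:[\w.-]+/)?xlm-"
    r"((?:clm|mlm|tlm)(?:-(?:clm|mlm|tlm))*)"  # one or more pretraining objectives
    r"-(\w+)"                                   # language set, e.g. "ende", "xnli15", "100"
    r"-(\d+)$"                                  # trailing number (assumed hidden size)
)


def parse_xlm_name(model_id):
    """Split a checkpoint name into objectives, language set, and size."""
    match = NAME_RE.match(model_id)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {model_id!r}")
    objectives, languages, size = match.groups()
    return {
        "objectives": objectives.split("-"),
        "languages": languages,
        "dim": int(size),
    }


info = parse_xlm_name("FacebookAI/xlm-mlm-tlm-xnli15-1024")
print(info)
```

Read this way, `clm`, `mlm`, and `tlm` would denote causal, masked, and translation language modeling, and the language field is either a concatenation of language codes (`ende`, `enro`), a count (`17`, `100`), or a benchmark language set (`xnli15`). The full identifiers are in the form that `from_pretrained` in the Hugging Face `transformers` library accepts.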
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Linear Layer · Multi-Head Attention · Residual Connection · Attention Dropout · Byte Pair Encoding · Dense Connections · Adam · Softmax · Dropout
