Multilingual Translation with Extensible Multilingual Pretraining and Finetuning
Yuqing Tang, Chau Tran, Xian Li, Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan

TL;DR
This paper introduces a method for creating multilingual translation models through multilingual finetuning of pretrained models, enabling support for more languages and improved translation performance, especially for low-resource languages.
Contribution
It demonstrates that multilingual finetuning of pretrained models can extend language coverage and improve translation quality without performance loss, and introduces the ML50 benchmark for standardized evaluation.
Findings
Multilingual finetuning outperforms bilingual and from-scratch models in BLEU score.
Extended mBART to support 50 languages without performance loss.
Created ML50 benchmark for reproducible multilingual translation research.
Abstract
Recent work demonstrates the potential of multilingual pretraining of creating one model that can be used for various tasks in different languages. Previous work in multilingual pretraining has demonstrated that machine translation systems can be created by finetuning on bitext. In this work, we show that multilingual translation models can be created through multilingual finetuning. Instead of finetuning on one direction, a pretrained model is finetuned on many directions at the same time. Compared to multilingual models trained from scratch, starting from pretrained models incorporates the benefits of large quantities of unlabeled monolingual data, which is particularly important for low resource languages where bitext is not available. We demonstrate that pretrained models can be extended to incorporate additional languages without loss of performance. We double the number of…
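The idea of finetuning on many directions at the same time can be illustrated with a small data-mixing sketch. This is not the authors' training code: the toy sentence pairs, the function names (`sampling_probs`, `tag_example`, `mixed_stream`), the temperature-based sampling rule, and the choice to prefix each sentence with an mBART-style language code are all assumptions made for illustration.

```python
import random

# Hypothetical toy bitext keyed by (source_lang, target_lang).
# Codes follow the mBART-style "xx_XX" convention; the sentences are made up.
BITEXT = {
    ("en_XX", "de_DE"): [("Hello.", "Hallo."), ("Thank you.", "Danke."),
                         ("Good night.", "Gute Nacht.")],
    ("en_XX", "fr_XX"): [("Hello.", "Bonjour."), ("Thank you.", "Merci.")],
    ("en_XX", "ne_NP"): [("Hello.", "नमस्ते।")],  # stands in for a low-resource pair
}


def sampling_probs(sizes, temperature=1.0):
    """Return p_i proportional to (n_i / N) ** (1 / T) for each direction.

    T = 1 samples in proportion to corpus size; larger T flattens the
    distribution so that small corpora are seen more often.
    """
    total = sum(sizes.values())
    weights = {k: (n / total) ** (1.0 / temperature) for k, n in sizes.items()}
    norm = sum(weights.values())
    return {k: w / norm for k, w in weights.items()}


def tag_example(src_lang, tgt_lang, src, tgt):
    """Prefix each side with its language code (placement is an assumption)."""
    return {"source": f"{src_lang} {src}", "target": f"{tgt_lang} {tgt}"}


def mixed_stream(bitext, num_examples, temperature=1.0, seed=0):
    """Yield tagged examples drawn from all directions in one stream."""
    rng = random.Random(seed)
    probs = sampling_probs({k: len(v) for k, v in bitext.items()}, temperature)
    directions = list(probs)
    weights = [probs[d] for d in directions]
    for _ in range(num_examples):
        src_lang, tgt_lang = rng.choices(directions, weights=weights, k=1)[0]
        src, tgt = rng.choice(bitext[(src_lang, tgt_lang)])
        yield tag_example(src_lang, tgt_lang, src, tgt)


if __name__ == "__main__":
    for example in mixed_stream(BITEXT, num_examples=5, temperature=5.0):
        print(example)
```

A single pretrained model would then be finetuned on this one mixed stream rather than on a separate stream per direction; the language codes tell the model which language it is reading and which it should produce.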
Code & Models
- 🤗 Narrativa/mbart-large-50-finetuned-opus-en-pt-translation · model · 16 dl · ♡ 12
- 🤗 Narrativa/mbart-large-50-finetuned-opus-pt-en-translation · model · 33 dl · ♡ 5
- 🤗 facebook/mbart-large-50-many-to-many-mmt · model · 158k dl · ♡ 406
- 🤗 facebook/mbart-large-50-many-to-one-mmt · model · 2.7k dl · ♡ 67
- 🤗 facebook/mbart-large-50-one-to-many-mmt · model · 203k dl · ♡ 39
- 🤗 facebook/mbart-large-50 · model · 25k dl · ♡ 166
- 🤗 nguyenvulebinh/mbart-large-50-latin-only · model · 8 dl
- 🤗 sanjitaa/mbart-many-to-many · model · 9 dl
- 🤗 Nishant24/mbart-finetuned-hi-to-en_Siddha_Yoga_Text_by_Nishant · model · 5 dl
- 🤗 SnypzZz/Llama2-13b-Language-translate · model · 173 dl · ♡ 127
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: mBART
