Massively Multilingual Neural Machine Translation
Roee Aharoni, Melvin Johnson, Orhan Firat

TL;DR
This paper explores training large-scale multilingual neural machine translation models supporting up to 102 languages, demonstrating improved performance and trade-offs in quality and resource efficiency, especially in low-resource scenarios.
Contribution
It presents the first extensive study of massively multilingual NMT models with up to 102 languages, analyzing training setups and demonstrating superior performance over previous methods.
Findings
Massively multilingual models outperform previous state-of-the-art in low-resource settings.
Models support up to 102 languages with promising translation quality.
Effective trade-offs between model size, number of languages, and translation performance.
Abstract
Multilingual neural machine translation (NMT) enables training a single model that supports translation from multiple source languages into multiple target languages. In this paper, we push the limits of multilingual NMT in terms of number of languages being used. We perform extensive experiments in training massively multilingual NMT models, translating up to 102 languages to and from English within a single model. We explore different setups for training such models and analyze the trade-offs between translation quality and various modeling decisions. We report results on the publicly available TED talks multilingual corpus where we show that massively multilingual many-to-many models are effective in low resource settings, outperforming the previous state-of-the-art while supporting up to 59 languages. Our experiments on a large-scale dataset with 102 languages to and from English…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
