Beyond English-Centric Multilingual Machine Translation

Angela Fan; Shruti Bhosale; Holger Schwenk; Zhiyi Ma; Ahmed El-Kishky; Siddharth Goyal; Mandeep Baines; Onur Celebi; Guillaume Wenzek; Vishrav Chaudhary; Naman Goyal; Tom Birch; Vitaliy Liptchinsky; Sergey Edunov; Edouard Grave; Michael Auli; Armand Joulin

arXiv:2010.11125·cs.CL·October 22, 2020·468 cites

Open Access · 5 Repos · 10 Models · 4 Datasets

TL;DR

This paper develops a true Many-to-Many multilingual translation model for 100 languages, significantly improving direct non-English translation quality and providing open-source datasets and models for broader research.

Contribution

It introduces a comprehensive training dataset and a scalable model architecture for direct translation between any pair of 100 languages, moving beyond English-centric approaches.

Findings

01 Over 10 BLEU gain in non-English translation directions

02 Competitive performance with top WMT systems

03 Open-source datasets and models for community use

Abstract

Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages. However, much of this work is English-Centric by training only on data which was translated from or to English. While this is supported by large sources of training data, it does not reflect translation needs worldwide. In this work, we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages. We build and open source a training dataset that covers thousands of language directions with supervised data, created through large-scale mining. Then, we explore how to effectively increase model capacity through a combination of dense scaling and language-specific sparse parameters to create high quality models. Our focus on non-English-Centric models brings…
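
To make the contrast between English-Centric and Many-to-Many coverage concrete, the sketch below counts translation directions for 100 languages. It illustrates only the combinatorics: the language codes are placeholders and the helper names are hypothetical, not taken from the paper. The mined dataset described in the abstract covers thousands of these directions, which need not be all of them.

```python
from itertools import permutations


def many_to_many_directions(languages):
    """Every ordered (source, target) pair of distinct languages."""
    return list(permutations(languages, 2))


def english_centric_directions(languages, pivot="en"):
    """Only the ordered pairs that have the pivot language on one side."""
    return [(src, tgt) for src, tgt in permutations(languages, 2)
            if pivot in (src, tgt)]


# Placeholder codes (assumption): "en" plus 99 made-up language identifiers.
languages = ["en"] + [f"lang{i:02d}" for i in range(1, 100)]

english_centric = english_centric_directions(languages)
many_to_many = many_to_many_directions(languages)

print(len(english_centric))  # 2 * 99 directions that involve English
print(len(many_to_many))     # 100 * 99 directions in total
```

With 100 languages, an English-Centric setup can supervise at most 198 directions, while the full Many-to-Many space has 9,900, so the large majority of directions involve no English at all.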

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification