The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

Naman Goyal; Cynthia Gao; Vishrav Chaudhary; Peng-Jen Chen; Guillaume Wenzek; Da Ju; Sanjana Krishnan; Marc'Aurelio Ranzato; Francisco Guzman; Angela Fan

arXiv:2106.03193·cs.CL·June 8, 2021·82 cites

TL;DR

FLORES-101 is a multilingual evaluation benchmark of 3001 sentences from English Wikipedia, each professionally translated into 101 languages, enabling better assessment of low-resource and multilingual machine translation models.

Contribution

We introduce FLORES-101, a comprehensive, professionally translated benchmark dataset for low-resource and multilingual machine translation evaluation.

Findings

01 Enables evaluation of many-to-many multilingual translation systems

02 Provides high-quality, multilingually aligned translations for 101 languages

03 Fosters progress in low-resource machine translation research

Abstract

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of different topics and domains. These sentences have been translated into 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are multilingually aligned. By publicly…
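
To illustrate why multilingual alignment matters for many-to-many evaluation, here is a minimal sketch. It assumes a hypothetical in-memory layout (a dict mapping a language code to a list of sentences, where index i is the translation of the same source sentence in every language); this is not the benchmark's actual file format or loader. Under that assumption, any two languages can be paired directly, so n languages yield n × (n - 1) ordered translation directions, e.g. 101 × 100 = 10,100 if all 101 languages are paired with each other.

```python
from itertools import permutations


def translation_directions(languages):
    """Return every ordered (source, target) pair of distinct languages."""
    return list(permutations(languages, 2))


def aligned_pairs(corpus, src, tgt):
    """Pair sentence i of `src` with sentence i of `tgt`.

    `corpus` is a hypothetical dict: language code -> list of sentences,
    with the same index referring to the same source sentence everywhere.
    """
    src_sents, tgt_sents = corpus[src], corpus[tgt]
    if len(src_sents) != len(tgt_sents):
        raise ValueError("corpus is not aligned: sentence counts differ")
    return list(zip(src_sents, tgt_sents))


# Toy data for illustration only; these sentences are not from FLORES-101.
toy_corpus = {
    "en": ["Hello.", "Thank you."],
    "fr": ["Bonjour.", "Merci."],
    "es": ["Hola.", "Gracias."],
}

directions = translation_directions(toy_corpus)
print(len(directions))  # 3 languages -> 3 * 2 = 6 directions
print(aligned_pairs(toy_corpus, "fr", "es"))  # a non-English pair, built directly
```

Because every language shares the same sentence indices, a non-English direction such as French to Spanish needs no pivoting through English to obtain reference translations.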

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification