Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages

Xavier Garcia; Aditya Siddhant; Orhan Firat; Ankur P. Parikh

arXiv:2009.11201·cs.CL·March 15, 2021

TL;DR

This paper shows that multilinguality, exploited through a three-stage training scheme, substantially improves unsupervised machine translation for low-resource languages, outperforming prior unsupervised baselines and matching supervised performance for Nepali-English.

Contribution

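The authors introduce a single multilingual unsupervised translation model covering five low-resource languages (Gujarati, Kazakh, Nepali, Sinhala, and Turkish), translating to and from English. It outperforms existing unsupervised baselines and matches supervised performance for Nepali-English.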

Findings

01

Achieved gains of up to 14.4 BLEU over state-of-the-art unsupervised baselines.

02

Outperformed various supervised WMT models for multiple language pairs.

03

Demonstrated that the approach is robust under different data quality conditions.

Abstract

Unsupervised translation has reached impressive performance on resource-rich language pairs such as English-French and English-German. However, early studies have shown that in more realistic settings involving low-resource, rare languages, unsupervised translation performs poorly, achieving less than 3.0 BLEU. In this work, we show that multilinguality is critical to making unsupervised systems practical for low-resource settings. In particular, we present a single model for 5 low-resource languages (Gujarati, Kazakh, Nepali, Sinhala, and Turkish) to and from English directions, which leverages monolingual and auxiliary parallel data from other high-resource language pairs via a three-stage training scheme. We outperform all current state-of-the-art unsupervised baselines for these languages, achieving gains of up to 14.4 BLEU. Additionally, we outperform a large collection of…
