Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT

Alexandra Chronopoulou; Dario Stojanovski; Alexander Fraser

arXiv:2009.07610·cs.CL·October 7, 2020

TL;DR

This paper introduces RE-LM, a method that takes a language model pretrained only on a high-resource language, extends its vocabulary to cover a low-resource language, fine-tunes it on both languages, and uses it to initialize an unsupervised neural machine translation system. RE-LM improves over XLM by more than 8.3 BLEU in all four English-Macedonian and English-Albanian translation directions.

Contribution

The paper proposes a novel vocabulary extension method and a fine-tuning approach to effectively reuse pretrained LMs for unsupervised NMT involving low-resource languages.

Findings

1. RE-LM outperforms XLM on English-Macedonian (En-Mk) and English-Albanian (En-Sq) translation.

2. The gain over XLM exceeds 8.3 BLEU points in each of the four translation directions.

3. Effective vocabulary extension is key to reusing pretrained LMs for low-resource languages.

Abstract

Using a language model (LM) pretrained on two languages with large monolingual data in order to initialize an unsupervised neural machine translation (UNMT) system yields state-of-the-art results. When limited data is available for one language, however, this method leads to poor translations. We present an effective approach that reuses an LM that is pretrained only on the high-resource language. The monolingual LM is fine-tuned on both languages and is then used to initialize a UNMT model. To reuse the pretrained LM, we have to modify its predefined vocabulary, to account for the new language. We therefore propose a novel vocabulary extension method. Our approach, RE-LM, outperforms a competitive cross-lingual pretraining model (XLM) in English-Macedonian (En-Mk) and English-Albanian (En-Sq), yielding more than +8.3 BLEU points for all four translation directions.
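The abstract does not spell out how the vocabulary is extended, so the following is only a minimal sketch of one plausible reading: keep every pretrained entry and its embedding row untouched, and append a freshly initialized row for each subword of the new language that the pretrained vocabulary lacks. The function name `extend_vocabulary`, the toy tokens, and the Gaussian initialization with `scale=0.02` are assumptions made for illustration, not details taken from the paper or its repository.

```python
import random


def extend_vocabulary(vocab, embeddings, new_tokens, seed=0, scale=0.02):
    """Hypothetical helper: add unseen tokens and one new embedding row each.

    vocab      -- dict mapping token -> row index into `embeddings`
    embeddings -- list of rows, each a list of floats of equal length
    new_tokens -- subwords observed in the new (low-resource) language
    """
    rng = random.Random(seed)
    dim = len(embeddings[0])
    new_vocab = dict(vocab)
    # Pretrained rows are copied unchanged so existing knowledge is preserved.
    new_embeddings = [row[:] for row in embeddings]
    for token in new_tokens:
        if token in new_vocab:
            continue  # shared subword: reuse the pretrained row
        new_vocab[token] = len(new_embeddings)
        # Assumed initialization: small random values, learned during fine-tuning.
        new_embeddings.append([rng.gauss(0.0, scale) for _ in range(dim)])
    return new_vocab, new_embeddings


# Toy example: a three-entry "pretrained" vocabulary with 2-dimensional embeddings.
vocab = {"<unk>": 0, "the": 1, "ing": 2}
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
new_vocab, new_embeddings = extend_vocabulary(
    vocab, embeddings, ["ing", "шко", "ата"]
)
```

In this sketch a subword shared by both languages (`"ing"`) keeps its pretrained row, while the two unseen subwords receive new indices at the end of the matrix. In a real model the output projection would need the same extension before fine-tuning on both languages.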

Code & Models

Repositories

alexandra-chron/relm_unmt (PyTorch, official)