Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation

Alexandra Chronopoulou; Dario Stojanovski; Alexander Fraser

arXiv:2103.10531 · cs.CL · April 15, 2021

TL;DR

This paper improves unsupervised neural machine translation (UNMT) by adding lexical-level information, in the form of cross-lingual subword embeddings, to bilingual masked language model pretraining. It targets the low-resource and distant language pairs where cross-lingual pretraining alone performs poorly.

Contribution

The paper enhances bilingual masked language model pretraining with lexical-level information from type-level cross-lingual subword embeddings, which improves UNMT performance.

Findings

01 Up to 4.5 BLEU improvement in UNMT

02 Enhanced bilingual lexicon induction results

03 Better alignment of lexical representations
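To make the bilingual lexicon induction finding more concrete, here is a minimal sketch of how such a task is commonly scored: each source word is paired with its nearest target word by cosine similarity in a shared embedding space, and the pairs are compared against a gold dictionary. The evaluation protocol actually used in the paper is not described on this page, so the retrieval rule, the function names (`cosine`, `induce_lexicon`, `precision_at_1`), and the toy vectors are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def induce_lexicon(src_emb, tgt_emb):
    """Pair every source word with its most similar target word.

    Assumption: plain nearest-neighbor retrieval, not the paper's protocol.
    """
    return {
        s: max(tgt_emb, key=lambda t: cosine(u, tgt_emb[t]))
        for s, u in src_emb.items()
    }

def precision_at_1(induced, gold):
    """Fraction of gold entries whose top-ranked translation is correct."""
    hits = sum(1 for s, t in gold.items() if induced.get(s) == t)
    return hits / len(gold)

# Toy embeddings in a shared space (illustrative values, not real data).
src = {"house": [0.9, 0.1], "cat": [0.1, 0.9]}
tgt = {"Haus": [0.8, 0.2], "Katze": [0.2, 0.8]}
gold = {"house": "Haus", "cat": "Katze"}

induced = induce_lexicon(src, tgt)
score = precision_at_1(induced, gold)
```

Better-aligned lexical representations should place translation pairs closer together in the shared space, which is what a retrieval score of this kind measures.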

Abstract

Successful methods for unsupervised neural machine translation (UNMT) employ cross-lingual pretraining via self-supervision, often in the form of a masked language modeling or a sequence generation task, which requires the model to align the lexical- and high-level representations of the two languages. While cross-lingual pretraining works for similar languages with abundant corpora, it performs poorly in low-resource and distant languages. Previous research has shown that this is because the representations are not sufficiently aligned. In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings. Empirical results demonstrate improved performance both on UNMT (up to 4.5 BLEU) and bilingual lexicon induction using our method compared to a UNMT baseline.
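One plausible reading of "enhancing pretraining with type-level cross-lingual subword embeddings" is that the masked language model's input embedding table is filled from pretrained cross-lingual subword vectors before pretraining starts. The sketch below illustrates only that idea in plain Python. It is an assumption, not the authors' implementation: the function name `init_embedding_table`, the random fallback for uncovered subwords, and the toy vectors are all hypothetical.

```python
import random

def init_embedding_table(vocab, pretrained, dim, seed=0):
    """Build an embedding table for a masked LM's input layer.

    Rows for subwords found in `pretrained` are copied over. All other
    rows (for example special tokens such as a mask symbol) are drawn
    from a small Gaussian. Returns the table and the number of copied rows.
    """
    rng = random.Random(seed)
    table, copied = [], 0
    for subword in vocab:
        vector = pretrained.get(subword)
        if vector is not None and len(vector) == dim:
            table.append(list(vector))  # reuse the cross-lingual vector
            copied += 1
        else:
            table.append([rng.gauss(0.0, dim ** -0.5) for _ in range(dim)])
    return table, copied

# Toy shared vocabulary and toy cross-lingual vectors (illustrative only).
vocab = ["<mask>", "house", "Haus", "cat"]
pretrained = {"house": [0.9, 0.1], "Haus": [0.8, 0.2]}

table, copied = init_embedding_table(vocab, pretrained, dim=2)
print(f"copied {copied} of {len(vocab)} rows")
```

With the toy inputs above, two of the four rows come from the pretrained vectors and the other two are randomly initialized. In a real setup the table would then be loaded into the model's embedding layer, and whether it is kept fixed or updated during pretraining is a further design choice this sketch does not address.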

Code & Models

Repositories

alexandra-chron/lexical_xlm_relm (PyTorch, official)
