Explicit Cross-lingual Pre-training for Unsupervised Machine Translation

Shuo Ren; Yu Wu; Shujie Liu; Ming Zhou; Shuai Ma

arXiv:1909.00180 · cs.CL · September 4, 2019 · 1 citation

TL;DR

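This paper introduces an explicit cross-lingual pre-training method for unsupervised machine translation. It computes cross-lingual n-gram embeddings, infers an n-gram translation table from them, and uses those translation pairs to train a new Cross-lingual Masked Language Model (CMLM), with the aim of improving translation quality.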

Contribution

It proposes a new pre-training approach that incorporates explicit cross-lingual signals via n-gram translation pairs, rather than relying only on the limited, implicit signal from a shared BPE space, to enhance unsupervised translation performance.

Findings

01 Significant improvement in unsupervised translation quality

02 Effective integration of explicit cross-lingual information

03 Demonstrated benefits of n-gram based pre-training

Abstract

Pre-training has proven to be effective in unsupervised machine translation due to its ability to model deep context information in cross-lingual scenarios. However, the cross-lingual information obtained from shared BPE spaces is inexplicit and limited. In this paper, we propose a novel cross-lingual pre-training method for unsupervised machine translation by incorporating explicit cross-lingual training signals. Specifically, we first calculate cross-lingual n-gram embeddings and infer an n-gram translation table from them. With those n-gram translation pairs, we propose a new pre-training model called Cross-lingual Masked Language Model (CMLM), which randomly chooses source n-grams in the input text stream and predicts their translation candidates at each time step. Experiments show that our method can incorporate beneficial cross-lingual information into pre-trained models. Taking…
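
The two steps named in the abstract (inferring an n-gram translation table from cross-lingual embeddings, then masking source n-grams and predicting their translation candidates) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the function names `infer_translation_table` and `mask_ngrams`, the hand-written 2-dimensional embeddings, the cosine nearest-neighbour lookup, and the masking probability are hypothetical and are not taken from the paper, which this page does not describe in that level of detail.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def infer_translation_table(src_emb, tgt_emb, top_k=2):
    """Map each source n-gram to its top_k nearest target n-grams.

    Assumption: nearest neighbours under cosine similarity stand in for
    whatever inference procedure the paper actually uses.
    """
    table = {}
    for src, u in src_emb.items():
        ranked = sorted(tgt_emb, key=lambda t: cosine(u, tgt_emb[t]), reverse=True)
        table[src] = ranked[:top_k]
    return table


def mask_ngrams(tokens, table, max_n=3, mask_prob=0.5, rng=None):
    """Mask randomly chosen source n-grams found in the translation table.

    Returns the masked token list and a list of
    (start, length, candidates) prediction targets.
    """
    rng = rng or random.Random(0)
    inputs, targets = [], []
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest n-gram starting at position i.
        for n in range(max_n, 0, -1):
            span = tokens[i:i + n]
            ngram = " ".join(span)
            if len(span) == n and ngram in table and rng.random() < mask_prob:
                inputs.extend(["[MASK]"] * n)
                targets.append((i, n, table[ngram]))
                i += n
                matched = True
                break
        if not matched:
            inputs.append(tokens[i])
            i += 1
    return inputs, targets


# Toy, hand-written embeddings (hypothetical values, not real model output).
src_emb = {"thank you": [1.0, 0.0], "machine": [0.0, 1.0]}
tgt_emb = {"danke": [0.9, 0.1], "maschine": [0.1, 0.9]}

table = infer_translation_table(src_emb, tgt_emb, top_k=1)
inputs, targets = mask_ngrams(
    ["i", "say", "thank", "you"], table, mask_prob=1.0
)
print(table)
print(inputs, targets)
```

With `mask_prob=1.0` every n-gram found in the table is masked, which makes the toy run deterministic; a real pre-training setup would mask only a fraction of spans and would train a model to predict the candidates rather than merely record them.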

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Byte Pair Encoding