Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation

Toan Q. Nguyen; David Chiang

arXiv:1708.09803·cs.CL·September 22, 2017·113 cites

Open Access

TL;DR

This paper introduces a transfer learning method for neural machine translation that leverages vocabulary overlap between related low-resource languages, resulting in improved translation quality when combined with BPE preprocessing.

Contribution

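The authors extend the transfer method of Zoph et al., which ignores source vocabulary overlap, so that it exploits that overlap between two related low-resource language pairs: parameters trained on the first pair, including the source word embeddings, initialize the model for the second pair.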

Findings

01

Transfer learning on top of a BPE baseline yields improvements of up to 4.3 BLEU.

02

Splitting words with BPE increases source vocabulary overlap, which the transfer method exploits.

03

Transfer helps more with BPE-based models than word-based models.
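
The overlap effect behind these findings can be illustrated with a toy sketch. The code below is a minimal BPE learner, not the authors' implementation, and the word lists are invented strings rather than data from the paper. The names `learn_bpe`, `segment`, and `type_overlap`, the `</w>` end-of-word marker, and the choice of 4 merges are all assumptions made for illustration. It compares how many child-language token types also occur in the parent language at the word level and at the subword level.

```python
from collections import Counter

END = "</w>"  # end-of-word marker (a common BPE convention, assumed here)

def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge operations from a list of words."""
    vocab = Counter(tuple(w) + (END,) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken by the pair itself so the
        # result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged = Counter()
        for symbols, freq in vocab.items():
            merged[tuple(merge_pair(list(symbols), best))] += freq
        vocab = merged
    return merges

def segment(word, merges):
    """Split one word into subwords by applying the merges in order."""
    symbols = list(word) + [END]
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols

def type_overlap(parent_tokens, child_tokens):
    """Fraction of child token types that also occur in the parent."""
    parent_types, child_types = set(parent_tokens), set(child_tokens)
    return len(parent_types & child_types) / len(child_types)

# Invented toy word lists standing in for two related source languages.
parent_words = ["kitoblar", "uylar", "kitobda"]
child_words = ["kitaplar", "evler", "kitapta"]

# Assumption: merges are learned jointly on both languages.
merges = learn_bpe(parent_words + child_words, num_merges=4)
parent_sub = [s for w in parent_words for s in segment(w, merges)]
child_sub = [s for w in child_words for s in segment(w, merges)]

word_overlap = type_overlap(parent_words, child_words)
subword_overlap = type_overlap(parent_sub, child_sub)
print(f"word-level overlap: {word_overlap:.2f}")
print(f"subword-level overlap: {subword_overlap:.2f}")
```

In this toy setting no whole word is shared, but several subword units are, which is the kind of overlap that gives transferred source embeddings something to attach to.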

Abstract

We present a simple method to improve neural translation of a low-resource language pair using parallel data from a related, also low-resource, language pair. The method is based on the transfer method of Zoph et al., but whereas their method ignores any source vocabulary overlap, ours exploits it. First, we split words using Byte Pair Encoding (BPE) to increase vocabulary overlap. Then, we train a model on the first language pair and transfer its parameters, including its source word embeddings, to another model and continue training on the second language pair. Our experiments show that transfer learning helps word-based translation only slightly, but when used on top of a much stronger BPE baseline, it yields larger improvements of up to 4.3 BLEU.
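
The transfer step described in the abstract might be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the model is reduced to plain dictionaries, source embeddings are matched by token string, tokens unseen in the parent get small random vectors, and `transfer_parameters`, the `"src_emb"` and `"other"` keys, and the parameter names are hypothetical.

```python
import copy
import random

def transfer_parameters(parent, child_vocab, dim, seed=0):
    """Initialize a child model from a trained parent model.

    Assumptions for this sketch: `parent` is a dict with a "src_emb"
    mapping (token -> vector) and an "other" mapping holding every
    remaining parameter.
    """
    rng = random.Random(seed)
    # All non-embedding parameters are copied over unchanged.
    child = {"other": copy.deepcopy(parent["other"]), "src_emb": {}}
    for token in child_vocab:
        if token in parent["src_emb"]:
            # Shared source token: reuse the parent's embedding.
            child["src_emb"][token] = list(parent["src_emb"][token])
        else:
            # Token unseen in the parent: fresh random initialization.
            child["src_emb"][token] = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    return child

# Hypothetical parent model with made-up values, for illustration only.
parent_model = {
    "src_emb": {"kit": [0.1, 0.2], "la": [0.3, 0.4], "o": [0.5, 0.6]},
    "other": {"encoder.w": [1.0, 2.0], "decoder.w": [3.0, 4.0]},
}
child_model = transfer_parameters(parent_model, ["kit", "la", "p"], dim=2)
# Training would then continue on the second language pair, starting
# from child_model instead of a random initialization.
```

The more subword types the two source languages share, the more child embeddings start from trained values rather than random ones.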

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis