Transfer Learning for Low-Resource Neural Machine Translation

Barret Zoph, Deniz Yuret, Jonathan May, and Kevin Knight

arXiv:1604.02201·cs.CL·April 11, 2016

TL;DR

This paper introduces a transfer learning approach for neural machine translation (NMT) that improves translation quality for low-resource languages by reusing parameters from a model trained on a high-resource language pair.

Contribution

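The authors propose a transfer learning method that first trains a parent model on a high-resource language pair, then uses some of its learned parameters to initialize and constrain the training of a child model on a low-resource pair. This improves baseline NMT by an average of 5.6 BLEU.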
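
The parent-to-child transfer described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the parameter group names, their sizes, the choice of which groups to transfer, and the choice of which group to keep frozen are all hypothetical, and flat lists of floats stand in for real weight matrices.

```python
import random

# Hypothetical parameter groups and sizes, chosen only for illustration.
SHAPES = {"source_embeddings": 4, "encoder": 6, "decoder": 6, "target_embeddings": 4}


def init_params(shapes, seed):
    """Randomly initialize a parameter table: group name -> list of floats."""
    rng = random.Random(seed)
    return {name: [rng.uniform(-0.1, 0.1) for _ in range(size)]
            for name, size in shapes.items()}


def make_child(parent, child_init, transferred):
    """Start from the child's own initialization, then overwrite the
    transferred groups with copies of the parent's learned values."""
    child = {name: list(values) for name, values in child_init.items()}
    for name in transferred:
        if len(parent[name]) != len(child[name]):
            raise ValueError(f"shape mismatch for {name}")
        child[name] = list(parent[name])
    return child


def sgd_step(params, grads, frozen, lr=0.5):
    """One gradient step that leaves the frozen groups untouched, which is
    one simple way to constrain child training."""
    updated = {}
    for name, values in params.items():
        if name in frozen:
            updated[name] = list(values)
        else:
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated


# Stand-in for a parent model already trained on a high-resource pair.
parent = init_params(SHAPES, seed=0)
# Fresh random initialization for the low-resource child.
child_init = init_params(SHAPES, seed=1)

# Assumed choices for this sketch only.
TRANSFERRED = ["encoder", "decoder", "target_embeddings"]
FROZEN = {"target_embeddings"}

child = make_child(parent, child_init, TRANSFERRED)

# Dummy gradients standing in for one step on the small child corpus.
grads = {name: [1.0] * size for name, size in SHAPES.items()}
child_after_step = sgd_step(child, grads, FROZEN)
```

In a real system the same idea would typically be expressed by loading the parent checkpoint into the child model and disabling gradient updates for the groups that should stay fixed.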

Findings

01

Transfer learning improved baseline NMT models by an average of 5.6 BLEU points across four low-resource language pairs.

02

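Ensembling and unknown word replacement added a further 2 BLEU points, bringing NMT close to a strong syntax-based machine translation (SBMT) system and past it on one language pair.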

03

Transfer learning models can also re-score the output of SBMT systems, improving on the SBMT systems' own results.
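
The re-scoring idea in the last finding can be illustrated with a generic n-best re-ranking sketch. This is an assumption-laden toy, not the procedure from the paper: the field names, the linear interpolation, the weight `nmt_weight`, and all of the scores below are invented for illustration.

```python
def rescore(nbest, nmt_weight=0.5):
    """Pick the hypothesis with the best weighted sum of the two model scores.

    Each hypothesis is a dict with a hypothetical "sbmt_score" (score from the
    first-pass system) and "nmt_logprob" (log-probability under the NMT model).
    """
    return max(
        nbest,
        key=lambda h: (1 - nmt_weight) * h["sbmt_score"]
        + nmt_weight * h["nmt_logprob"],
    )


# Invented n-best list: the first-pass system slightly prefers hypothesis A,
# while the NMT model strongly prefers hypothesis B.
nbest = [
    {"text": "hypothesis A", "sbmt_score": -1.0, "nmt_logprob": -6.0},
    {"text": "hypothesis B", "sbmt_score": -1.2, "nmt_logprob": -2.0},
]

best_sbmt_only = rescore(nbest, nmt_weight=0.0)  # ignores the NMT score
best_combined = rescore(nbest, nmt_weight=0.5)   # equal weight on both scores
```

With the NMT score ignored, the first-pass ranking is kept; once both scores count, the second model's preference can change which hypothesis is selected.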

Abstract

The encoder-decoder framework for neural machine translation (NMT) has been shown effective in large data scenarios, but is much less effective for low-resource languages. We present a transfer learning method that significantly improves Bleu scores across a range of low-resource languages. Our key idea is to first train a high-resource language pair (the parent model), then transfer some of the learned parameters to the low-resource pair (the child model) to initialize and constrain training. Using our transfer learning method we improve baseline NMT models by an average of 5.6 Bleu on four low-resource language pairs. Ensembling and unknown word replacement add another 2 Bleu which brings the NMT performance on low-resource machine translation close to a strong syntax based machine translation (SBMT) system, exceeding its performance on one language pair. Additionally, using the…

Code & Models

Repositories

isi-nlp/Zoph_RNN