An Empirical Exploration of Curriculum Learning for Neural Machine Translation

Xuan Zhang; Gaurav Kumar; Huda Khayrallah; Kenton Murray; Jeremy Gwinnup; Marianna J Martindale; Paul McNamee; Kevin Duh; Marine Carpuat

arXiv:1811.00739·cs.CL·November 5, 2018·106 cites

TL;DR

This paper investigates how curriculum learning can improve the training efficiency of neural machine translation models by exploring different sample presentation strategies, showing potential for faster convergence without sacrificing translation quality.

Contribution

It provides an extensive empirical analysis of curriculum learning in neural machine translation, highlighting the importance of curriculum design choices.

Findings

01. Curriculum learning can speed up training convergence.

02. Sample difficulty criteria significantly affect results.

03. Proper hyperparameter tuning is crucial for success.

Abstract

Machine translation systems based on deep neural networks are expensive to train. Curriculum learning aims to address this issue by choosing the order in which samples are presented during training to help train better models faster. We adopt a probabilistic view of curriculum learning, which lets us flexibly evaluate the impact of curricula design, and perform an extensive exploration on a German-English translation task. Results show that it is possible to improve convergence time at no loss in translation quality. However, results are highly sensitive to the choice of sample difficulty criteria, curriculum schedule and other hyperparameters.
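
The idea of choosing the order in which samples are presented can be illustrated with a small sketch. It is not the paper's method or Sockeye's implementation. It assumes a simple scheme in which the data is sorted by a difficulty score, split into shards, and harder shards become visible to the sampler phase by phase. The function names `make_shards` and `curriculum_batches`, the use of sentence length as the difficulty score, and the fixed `steps_per_phase` schedule are all hypothetical choices made for illustration.

```python
import random


def make_shards(samples, difficulty, num_shards):
    """Sort samples by a difficulty score and split them into at most
    num_shards contiguous shards, easiest first."""
    ranked = sorted(samples, key=difficulty)
    # Ceiling division, with a floor of 1 so the range step is never zero.
    size = max(1, -(-len(ranked) // num_shards))
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]


def curriculum_batches(shards, steps_per_phase, batch_size, seed=0):
    """Yield (phase, batch) pairs. In phase k only the k easiest shards
    are visible, and each batch is drawn uniformly from the visible pool."""
    rng = random.Random(seed)
    for phase in range(1, len(shards) + 1):
        visible = [s for shard in shards[:phase] for s in shard]
        for _ in range(steps_per_phase):
            yield phase, rng.sample(visible, min(batch_size, len(visible)))


if __name__ == "__main__":
    # Toy corpus; token count stands in for sample difficulty (an assumption).
    corpus = ["a b c d", "a", "a b c", "a b", "a b c d e", "a b c d e f"]
    shards = make_shards(corpus, lambda s: len(s.split()), num_shards=3)
    for phase, batch in curriculum_batches(shards, steps_per_phase=2, batch_size=2):
        print(phase, batch)
```

Swapping the `difficulty` function or the number of steps per phase changes the curriculum without touching the sampler, which is the kind of design choice the abstract reports results as being sensitive to.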

Code & Models

Repositories

awslabs/sockeye (mxnet, Official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification