Curriculum Learning and Minibatch Bucketing in Neural Machine Translation

Tom Kocmi; Ondrej Bojar

arXiv:1707.09533·cs.CL·July 9, 2020

TL;DR

This paper investigates how two sentence-ordering strategies, minibatch homogeneity and curriculum learning, affect neural machine translation training. In English-to-Czech experiments, minibatch homogeneity has no effect, while some curricula yield a small improvement over the baseline.

Contribution

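It evaluates two orderings of sentence pairs in on-line NMT training: minibatch bucketing, where each minibatch contains sentences similar in some aspect, and curriculum learning, where some sentence types are included gradually as training progresses.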

Findings

01

Curriculum learning, the gradual inclusion of some sentence types as training progresses, can improve translation quality.

02

The internal homogeneity of minibatches has no effect on training.

03

Some curricula achieve a small improvement over the baseline in the English-to-Czech experiments.

Abstract

We examine the effects of particular orderings of sentence pairs on the on-line training of neural machine translation (NMT). We focus on two types of such orderings: (1) ensuring that each minibatch contains sentences similar in some aspect and (2) gradual inclusion of some sentence types as the training progresses (so called "curriculum learning"). In our English-to-Czech experiments, the internal homogeneity of minibatches has no effect on the training but some of our "curricula" achieve a small improvement over the baseline.
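
The two orderings described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: source-sentence length in tokens stands in for "some aspect" of similarity and for sentence difficulty, and the function names, the `num_stages` parameter, and the stage schedule are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def length(pair: Pair) -> int:
    """Illustrative criterion (an assumption): source length in tokens."""
    return len(pair[0].split())


def bucketed_minibatches(pairs: Sequence[Pair], batch_size: int,
                         rng: random.Random) -> List[List[Pair]]:
    """Ordering (1): each minibatch holds sentences of similar length."""
    ordered = sorted(pairs, key=length)
    batches = [list(ordered[i:i + batch_size])
               for i in range(0, len(ordered), batch_size)]
    # Homogeneous inside each batch, random order across batches.
    rng.shuffle(batches)
    return batches


def curriculum_minibatches(pairs: Sequence[Pair], batch_size: int,
                           num_stages: int,
                           rng: random.Random) -> List[List[Pair]]:
    """Ordering (2): later stages gradually include longer sentences."""
    ordered = sorted(pairs, key=length)
    batches: List[List[Pair]] = []
    for stage in range(1, num_stages + 1):
        # Stage k draws from the shortest k/num_stages share of the data,
        # so earlier (easier) examples are revisited in later stages.
        cutoff = max(1, len(ordered) * stage // num_stages)
        pool = list(ordered[:cutoff])
        rng.shuffle(pool)
        batches.extend(pool[i:i + batch_size]
                       for i in range(0, len(pool), batch_size))
    return batches
```

In this sketch the curriculum revisits short sentences at every stage, which is only one of several possible schedules. A training loop would consume the returned list of minibatches in order.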
