Curriculum Learning and Minibatch Bucketing in Neural Machine Translation
Tom Kocmi, Ondrej Bojar

TL;DR
This paper investigates how the ordering of sentence pairs affects neural machine translation training. In English-to-Czech experiments, making each minibatch internally homogeneous has no effect, while some curricula yield a small improvement over the baseline.
Contribution
It evaluates two ways of ordering training data in NMT: bucketing similar sentences into the same minibatch, and curriculum learning, in which some sentence types are included gradually as training progresses.
Findings
Some curricula achieve a small improvement over the baseline in English-to-Czech translation.
The internal homogeneity of minibatches has no effect on training.
Not every curriculum tested improves on the baseline.
Abstract
We examine the effects of particular orderings of sentence pairs on the on-line training of neural machine translation (NMT). We focus on two types of such orderings: (1) ensuring that each minibatch contains sentences similar in some aspect and (2) gradual inclusion of some sentence types as the training progresses (so-called "curriculum learning"). In our English-to-Czech experiments, the internal homogeneity of minibatches has no effect on the training but some of our "curricula" achieve a small improvement over the baseline.
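The two orderings named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the similarity criterion (source length in whitespace tokens), the phase schedule, and every name below (`length_key`, `bucketed_minibatches`, `curriculum_minibatches`, `num_phases`) are assumptions chosen for illustration only.

```python
import random
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def length_key(pair: Pair) -> int:
    """Assumed criterion: number of whitespace tokens in the source sentence."""
    return len(pair[0].split())


def bucketed_minibatches(
    pairs: Sequence[Pair],
    batch_size: int,
    key: Callable[[Pair], int] = length_key,
    seed: int = 0,
) -> List[List[Pair]]:
    """Ordering (1): every minibatch holds pairs that are similar under `key`."""
    rng = random.Random(seed)
    ordered = sorted(pairs, key=key)
    batches = [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
    # Shuffle the batch order so that only within-batch similarity is imposed,
    # not an easy-to-hard progression across batches.
    rng.shuffle(batches)
    return batches


def curriculum_minibatches(
    pairs: Sequence[Pair],
    batch_size: int,
    num_phases: int,
    key: Callable[[Pair], int] = length_key,
    seed: int = 0,
) -> List[List[Pair]]:
    """Ordering (2): phase p draws from the lowest-scoring p/num_phases of the data."""
    rng = random.Random(seed)
    ordered = sorted(pairs, key=key)
    batches: List[List[Pair]] = []
    for phase in range(1, num_phases + 1):
        # The pool grows each phase; the last phase covers the full data set.
        cutoff = max(batch_size, len(ordered) * phase // num_phases)
        pool = ordered[:cutoff]
        rng.shuffle(pool)  # mixed minibatches inside the currently allowed pool
        batches.extend(pool[i:i + batch_size] for i in range(0, len(pool), batch_size))
    return batches
```

Swapping `length_key` for another scoring function (for example, a word-rarity score) would change what counts as "similar" or "easy" without touching the batching logic, which is why the criterion is passed in as a parameter here.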
