Token-wise Curriculum Learning for Neural Machine Translation

Chen Liang; Haoming Jiang; Xiaodong Liu; Pengcheng He; Weizhu Chen; Jianfeng Gao; Tuo Zhao

arXiv:2103.11088·cs.CL·March 23, 2021

Open Access

TL;DR

This paper introduces a token-wise curriculum learning method for neural machine translation that improves training efficiency and translation quality, especially for low-resource languages, by gradually expanding target subsequences during training.

Contribution

The paper proposes a novel token-wise curriculum learning approach that creates a sufficient number of easy samples, which makes it suitable for low-resource scenarios, and that outperforms the baselines it is compared against.

Findings

1. Outperforms baselines on 5 language pairs.

2. Especially effective for low-resource languages.

3. Combining with sentence-level methods further improves results.

Abstract

Existing curriculum learning approaches to Neural Machine Translation (NMT) require sampling sufficient amounts of "easy" samples from training data at the early training stage. This is not always achievable for low-resource languages where the amount of training data is limited. To address such limitation, we propose a novel token-wise curriculum learning approach that creates sufficient amounts of easy samples. Specifically, the model learns to predict a short sub-sequence from the beginning part of each target sentence at the early stage of training, and then the sub-sequence is gradually expanded as the training progresses. Such a new curriculum design is inspired by the cumulative effect of translation errors, which makes the latter tokens more difficult to predict than the beginning ones. Extensive experiments show that our approach can consistently outperform baselines on 5…
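The prefix-expansion idea described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names `prefix_fraction` and `prefix_loss_mask`, the linear schedule, and the `start_frac` and `curriculum_steps` parameters are all assumptions made for the example. The abstract says only that the sub-sequence starts short and is gradually expanded.

```python
import math
from typing import List


def prefix_fraction(step: int, curriculum_steps: int, start_frac: float = 0.1) -> float:
    """Fraction of each target sentence used for training at a given step.

    Assumption: a linear ramp from start_frac to 1.0 over curriculum_steps.
    The actual schedule used in the paper is not stated in the abstract.
    """
    if curriculum_steps <= 0:
        return 1.0
    # Clamp progress to [0, 1] so the fraction never exceeds the full sentence.
    progress = min(max(step / curriculum_steps, 0.0), 1.0)
    return start_frac + (1.0 - start_frac) * progress


def prefix_loss_mask(target_lengths: List[int], frac: float) -> List[List[int]]:
    """Build per-token loss masks that keep only the leading part of each target.

    A 1 marks a target position that contributes to the loss; a 0 marks a
    position that is ignored at the current curriculum stage.
    """
    masks = []
    for n in target_lengths:
        # Always keep at least one token of a non-empty sentence.
        keep = min(n, max(1, math.ceil(frac * n))) if n > 0 else 0
        masks.append([1] * keep + [0] * (n - keep))
    return masks


if __name__ == "__main__":
    # Early, middle, and late points of a hypothetical 1000-step curriculum.
    for step in (0, 500, 1000):
        frac = prefix_fraction(step, curriculum_steps=1000)
        print(step, prefix_loss_mask([10, 4], frac))
```

In a training loop, such a mask would typically be multiplied into the per-token cross-entropy before averaging, so that every sentence pair yields a short, easier training target early on instead of requiring a separate pool of easy sentences.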

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification