Dynamic Curriculum Learning for Low-Resource Neural Machine Translation
Chen Xu, Bojie Hu, Yufan Jiang, Kai Feng, Zeyang Wang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

TL;DR
This paper introduces a dynamic curriculum learning approach for low-resource neural machine translation, which adaptively reorders training data based on model progress, leading to improved translation performance.
Contribution
It proposes a dynamic curriculum learning method that reorders training samples according to loss decline and model competence, rather than relying on a static scoring function as previous approaches do.
Findings
Dynamic curriculum learning (DCL) outperforms several strong baselines on three low-resource translation benchmarks.
The method improves translation quality across different data sizes.
Dynamic reordering accelerates training and enhances model learning.
Abstract
Large amounts of data have made neural machine translation (NMT) a big success in recent years, but training these models on small-scale corpora remains a challenge. In this case, how the data is used becomes more important. Here, we investigate the effective use of training data for low-resource NMT. In particular, we propose a dynamic curriculum learning (DCL) method to reorder training samples during training. Unlike previous work, we do not use a static scoring function for reordering. Instead, the order of training samples is dynamically determined by two factors: loss decline and model competence. This eases training by highlighting easy samples that the current model has enough competence to learn. We test our DCL method in a Transformer-based system. Experimental results show that DCL outperforms several strong baselines on three low-resource machine translation…
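The reordering idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that difficulty is approximated by the relative drop in a sample's loss between two checkpoints, and that competence is a fraction in [0, 1] controlling how much of the ranked data is exposed to the model. The names `loss_decline` and `select_samples` are hypothetical.

```python
import math
from typing import List, Sequence


def loss_decline(prev_losses: Sequence[float], curr_losses: Sequence[float]) -> List[float]:
    """Relative drop in per-sample loss between two checkpoints.

    Assumption: a larger drop means the sample is currently easier to learn.
    """
    return [(p - c) / p if p > 0 else 0.0 for p, c in zip(prev_losses, curr_losses)]


def select_samples(
    prev_losses: Sequence[float],
    curr_losses: Sequence[float],
    competence: float,
) -> List[int]:
    """Return indices of the samples made available at the given competence.

    Samples are ranked from largest to smallest loss decline, and the top
    `competence` fraction of that ranking is kept (at least one sample).
    """
    if not 0.0 <= competence <= 1.0:
        raise ValueError("competence must lie in [0, 1]")
    declines = loss_decline(prev_losses, curr_losses)
    # sorted() is stable, so ties keep their original corpus order.
    order = sorted(range(len(declines)), key=lambda i: declines[i], reverse=True)
    keep = max(1, math.ceil(competence * len(order)))
    return order[:keep]


if __name__ == "__main__":
    prev = [4.0, 4.0, 4.0, 4.0]
    curr = [2.0, 3.0, 1.0, 4.0]
    # With competence 0.5, only the two samples with the largest decline are used.
    print(select_samples(prev, curr, competence=0.5))
```

Because the ranking is recomputed from fresh losses at each checkpoint, the order changes as the model improves, which is what distinguishes this from a static scoring function computed once before training.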
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
