Joint Training for Neural Machine Translation Models with Monolingual Data
Zhirui Zhang, Shujie Liu, Mu Li, Ming Zhou, Enhong Chen

TL;DR
This paper introduces a joint EM training approach for NMT models that leverages monolingual data by iteratively improving source-to-target and target-to-source models, leading to significant translation quality gains.
Contribution
It proposes a novel joint training method for NMT that effectively utilizes monolingual data through iterative pseudo-data generation and model refinement.
Findings
Improves translation quality on Chinese-English and English-German tasks.
Outperforms baseline systems that also use monolingual data, including back-translation.
Enhances both source-to-target and target-to-source models simultaneously.
Abstract
Monolingual data have been demonstrated to be helpful in improving translation quality of both statistical machine translation (SMT) systems and neural machine translation (NMT) systems, especially in resource-poor or domain adaptation tasks where parallel data are not rich enough. In this paper, we propose a novel approach to better leveraging monolingual data for neural machine translation by jointly learning source-to-target and target-to-source NMT models for a language pair with a joint EM optimization method. The training process starts with two initial NMT models pre-trained on parallel data for each direction, and these two models are iteratively updated by incrementally decreasing translation losses on training data. In each iteration step, both NMT models are first used to translate monolingual data from one language to the other, forming pseudo-training data of the other NMT…
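The iterative loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `ToyTranslator` is a hypothetical word-by-word lookup table standing in for an NMT model, `joint_train` and all parameter names are assumptions made for illustration, and the EM-specific details of the paper's objective are omitted. It only shows the data flow: each model translates monolingual data to create pseudo-training pairs for the model in the opposite direction, and both are then refit.

```python
from collections import Counter, defaultdict


class ToyTranslator:
    """Hypothetical stand-in for an NMT model: a word-by-word lookup table."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def fit(self, pairs):
        # Re-estimate the table from scratch on (input, output) sentence pairs,
        # aligning words by position (a toy assumption, not real alignment).
        self.counts = defaultdict(Counter)
        for src, tgt in pairs:
            for s, t in zip(src.split(), tgt.split()):
                self.counts[s][t] += 1

    def translate(self, sentence):
        # Pick the most frequent output word; copy unknown words unchanged.
        out = []
        for w in sentence.split():
            if w in self.counts:
                out.append(self.counts[w].most_common(1)[0][0])
            else:
                out.append(w)
        return " ".join(out)


def joint_train(parallel, mono_src, mono_tgt, iterations=2):
    """Sketch of the joint loop: pre-train both directions, then iterate."""
    reversed_parallel = [(t, s) for s, t in parallel]

    # Step 0: initial models pre-trained on parallel data, one per direction.
    s2t, t2s = ToyTranslator(), ToyTranslator()
    s2t.fit(parallel)
    t2s.fit(reversed_parallel)

    for _ in range(iterations):
        # Target-side monolingual data is translated by the target-to-source
        # model, giving (pseudo source, real target) pairs for s2t.
        pseudo_for_s2t = [(t2s.translate(t), t) for t in mono_tgt]
        # Source-side monolingual data is translated by the source-to-target
        # model, giving (pseudo target, real source) pairs for t2s.
        pseudo_for_t2s = [(s2t.translate(s), s) for s in mono_src]

        # Both models are refit on parallel plus pseudo data.
        s2t.fit(parallel + pseudo_for_s2t)
        t2s.fit(reversed_parallel + pseudo_for_t2s)

    return s2t, t2s


if __name__ == "__main__":
    parallel = [("le chat", "the cat"), ("le chien", "the dog")]
    s2t, t2s = joint_train(parallel, mono_src=["le chien"], mono_tgt=["the cat"])
    print(s2t.translate("le chat"))
    print(t2s.translate("the dog"))
```

In this toy setting the pseudo pairs simply reinforce the existing table. With real NMT models, each refit would be a gradient-based training pass, and the quality of the pseudo data would depend on the current state of the opposite-direction model.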
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
