Alternated Training with Synthetic and Authentic Data for Neural Machine Translation
Rui Jiao, Zonghan Yang, Maosong Sun, Yang Liu

TL;DR
This paper introduces alternated training for neural machine translation: training alternates between synthetic and authentic data, so that the authentic data steers the model away from the noise in synthetic data and towards parameters with higher BLEU scores.
Contribution
It proposes a novel alternated training approach that leverages authentic data as guidance to enhance NMT performance with synthetic data.
Findings
Alternated training improves translation performance over several strong baselines on Chinese-English and German-English tasks.
Authentic data guides model parameters towards higher BLEU scores.
Visualization shows authentic data helps stabilize training and improve quality.
Abstract
While synthetic bilingual corpora have demonstrated their effectiveness in low-resource neural machine translation (NMT), adding more synthetic data often deteriorates translation performance. In this work, we propose alternated training with synthetic and authentic data for NMT. The basic idea is to alternate synthetic and authentic corpora iteratively during training. Compared with previous work, we introduce authentic data as guidance to prevent the training of NMT models from being disturbed by noisy synthetic data. Experiments on Chinese-English and German-English translation tasks show that our approach improves the performance over several strong baselines. We visualize the BLEU landscape to further investigate the role of authentic and synthetic data during alternated training. From the visualization, we find that authentic data helps to direct the NMT model parameters towards…
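The alternation described in the abstract can be pictured as a simple outer loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `alternated_training`, the `train_on` callback, the fixed `rounds` count, and the choice to start each round with the synthetic corpus are all hypothetical, since the abstract does not specify the stopping criterion or how each phase is trained.

```python
from typing import Callable, List, Sequence, Tuple

# A sentence pair: (source sentence, target sentence).
Pair = Tuple[str, str]


def alternated_training(
    train_on: Callable[[Sequence[Pair]], None],
    synthetic: Sequence[Pair],
    authentic: Sequence[Pair],
    rounds: int,
) -> List[str]:
    """Alternate training phases between a synthetic and an authentic corpus.

    `train_on` is a hypothetical callback that updates the model on one
    corpus (for example, by training for some number of steps). A fixed
    number of rounds is an assumption made for illustration only.
    Returns the order in which the corpora were used.
    """
    schedule: List[str] = []
    for _ in range(rounds):
        # Assumed order: a synthetic phase followed by an authentic phase,
        # so that authentic data can correct drift caused by noisy pairs.
        for name, corpus in (("synthetic", synthetic), ("authentic", authentic)):
            train_on(corpus)
            schedule.append(name)
    return schedule
```

In a real system, `train_on` would wrap an NMT training routine, and the loop would more plausibly stop based on validation performance than on a fixed round count.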
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
