Bridging the Gap between Training and Inference for Neural Machine Translation
Wen Zhang, Yang Feng, Fandong Meng, Di You, Qun Liu

TL;DR
This paper proposes a training method for Neural Machine Translation that samples context words from both the ground-truth sequence and the model's predicted sequence, which reduces error accumulation and overcorrection and improves translation quality.
Contribution
It introduces a training approach that samples context words from the model's predicted sequence as well as from the ground truth, bridging the gap between training and inference in NMT.
Findings
Significant improvements on Chinese->English translation tasks.
Enhanced translation accuracy on WMT'14 English->German.
Effective reduction of error propagation during sequence generation.
Abstract
Neural Machine Translation (NMT) generates target words sequentially, predicting each next word conditioned on the context words. At training time, it predicts with the ground-truth words as context, while at inference it has to generate the entire sequence from scratch. This discrepancy in the fed context leads to error accumulation along the way. Furthermore, word-level training requires strict matching between the generated sequence and the ground-truth sequence, which leads to overcorrection of different but reasonable translations. In this paper, we address these issues by sampling context words not only from the ground-truth sequence but also from the sequence predicted by the model during training, where the predicted sequence is selected with a sentence-level optimum. Experiment results on Chinese->English and WMT'14 English->German translation tasks demonstrate that…
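The context-sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mix_context`, the fixed probability `p_gold`, and the assumption that the predicted sequence has already been produced and has the same length as the ground truth are all hypothetical choices made for illustration. How the predicted sequence is selected and how the sampling probability is set are not covered here.

```python
import random


def mix_context(gold, predicted, p_gold, rng=None):
    """Build the context fed to the decoder during training.

    For each target position, keep the ground-truth word with
    probability p_gold, otherwise use the model's predicted word.
    Assumption: `gold` and `predicted` are already the same length.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if rng is None:
        rng = random.Random()
    # rng.random() is in [0.0, 1.0), so p_gold=1.0 always picks the
    # ground truth and p_gold=0.0 always picks the prediction.
    return [g if rng.random() < p_gold else p
            for g, p in zip(gold, predicted)]


if __name__ == "__main__":
    gold = ["we", "should", "comply", "with", "the", "rule"]
    predicted = ["we", "should", "abide", "by", "the", "law"]
    print(mix_context(gold, predicted, p_gold=0.5, rng=random.Random(0)))
```

With `p_gold=1.0` the sketch reduces to ordinary training on ground-truth context, and with `p_gold=0.0` the context comes entirely from the predicted sequence, which mirrors the inference-time condition.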
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
