Uncertainty-Aware Semantic Augmentation for Neural Machine Translation
Xiangpeng Wei, Heng Yu, Yue Hu, Rongxiang Weng, Luxi Xing, and Weihua Luo

TL;DR
This paper introduces an uncertainty-aware semantic augmentation method for neural machine translation that captures semantic variations among source sentences to improve translation quality, addressing training-inference data distribution discrepancies.
Contribution
It proposes a novel semantic augmentation technique that explicitly models semantic uncertainty, enhancing NMT performance by aligning training and inference data distributions.
Findings
Significant improvements over strong baselines.
Effective handling of semantic variations in source sentences.
Enhanced translation accuracy across multiple tasks.
Abstract
As a sequence-to-sequence generation task, neural machine translation (NMT) naturally contains intrinsic uncertainty, where a single sentence in one language has multiple valid counterparts in the other. However, the dominant methods for NMT only observe one of them from the parallel corpora for the model training but have to deal with adequate variations under the same meaning at inference. This leads to a discrepancy of the data distribution between the training and the inference phases. To address this problem, we propose uncertainty-aware semantic augmentation, which explicitly captures the universal semantic information among multiple semantically-equivalent source sentences and enhances the hidden representations with this information for better translations. Extensive experiments on various translation tasks reveal that our approach significantly outperforms the strong baselines…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
