AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
Yong Cheng, Lu Jiang, Wolfgang Macherey, Jacob Eisenstein

TL;DR
AdvAug is an adversarial augmentation technique for neural machine translation that trains on virtual sentences drawn from a smooth interpolated embedding space. It improves robustness and translation quality over the Transformer baseline and over other data augmentation techniques such as back-translation.
Contribution
The paper presents a new adversarial augmentation method, AdvAug, which leverages a novel vicinity distribution for virtual sentences to improve NMT performance without extra data.
Findings
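AdvAug improves BLEU by up to 4.9 points over the Transformer on Chinese-English, English-French, and English-German translation benchmarks.
It substantially outperforms other data augmentation techniques such as back-translation, without using extra corpora.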
The method enhances model robustness and translation quality.
Abstract
In this paper, we propose a new adversarial augmentation method for Neural Machine Translation (NMT). The main idea is to minimize the vicinal risk over virtual sentences sampled from two vicinity distributions, of which the crucial one is a novel vicinity distribution for adversarial sentences that describes a smooth interpolated embedding space centered around observed training sentence pairs. We then discuss our approach, AdvAug, to train NMT models using the embeddings of virtual sentences in sequence-to-sequence learning. Experiments on Chinese-English, English-French, and English-German translation benchmarks show that AdvAug achieves significant improvements over the Transformer (up to 4.9 BLEU points), and substantially outperforms other data augmentation techniques (e.g. back-translation) without using extra corpora.
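The "smooth interpolated embedding space" in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a mixup-style convex combination of two token-embedding sequences, with the mixing weight drawn from a Beta distribution. The function names, the `alpha` value, and the requirement that both sequences are padded to the same length are all hypothetical choices made for illustration.

```python
import random


def interpolate_embeddings(emb_a, emb_b, lam):
    """Return lam * emb_a + (1 - lam) * emb_b, position by position.

    emb_a and emb_b are lists of token embeddings (lists of floats).
    Assumption: both sequences are already padded to the same length.
    """
    if len(emb_a) != len(emb_b):
        raise ValueError("sequences must be padded to the same length")
    return [
        [lam * x + (1.0 - lam) * y for x, y in zip(tok_a, tok_b)]
        for tok_a, tok_b in zip(emb_a, emb_b)
    ]


def sample_virtual_sentence(emb_a, emb_b, alpha=0.4, rng=random):
    """Draw a mixing weight from Beta(alpha, alpha) and interpolate.

    alpha=0.4 is an illustrative value, not one taken from the paper.
    """
    lam = rng.betavariate(alpha, alpha)
    return lam, interpolate_embeddings(emb_a, emb_b, lam)


# Two toy "sentences", each with two tokens of dimension two.
sentence_a = [[1.0, 0.0], [0.0, 1.0]]
sentence_b = [[0.0, 1.0], [1.0, 0.0]]
lam, virtual = sample_virtual_sentence(sentence_a, sentence_b)
```

Each value in `virtual` lies between the corresponding values of the two inputs, which is the sense in which the sampled embeddings stay in the vicinity of observed training sentences. In the abstract's setting the same idea would apply to both the source and target sides of a sentence pair.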
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Multi-Head Attention · Adam · Dropout · Byte Pair Encoding
