Robust Neural Machine Translation with Doubly Adversarial Inputs
Yong Cheng, Lu Jiang, and Wolfgang Macherey

TL;DR
This paper introduces a doubly adversarial training method to enhance neural machine translation robustness by attacking with adversarial source inputs and defending with adversarial target inputs, leading to improved performance on clean and noisy data.
Contribution
It presents a novel gradient-based adversarial input generation technique and a dual adversarial training framework for NMT robustness enhancement.
Findings
Achieved a 2.8 BLEU point improvement over Transformer on Chinese-English translation.
Achieved a 1.6 BLEU point improvement over Transformer on English-German translation.
Demonstrated higher robustness on noisy data.
Abstract
Neural machine translation (NMT) often suffers from vulnerability to noisy perturbations in the input. We propose an approach to improving the robustness of NMT models, which consists of two parts: (1) attack the translation model with adversarial source examples; (2) defend the translation model with adversarial target inputs to improve its robustness against the adversarial source inputs. For the generation of adversarial inputs, we propose a gradient-based method to craft adversarial examples informed by the translation loss over the clean inputs. Experimental results on Chinese-English and English-German translation tasks demonstrate that our approach achieves significant improvements (2.8 and 1.6 BLEU points) over Transformer on standard clean benchmarks as well as exhibiting higher robustness on noisy data.
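To make the idea of gradient-based adversarial input generation concrete, here is a minimal sketch of one way a gradient-guided word substitution could work. It is an illustration under stated assumptions, not the authors' implementation. For a chosen position, it picks the candidate word whose embedding offset from the original word aligns best (by cosine similarity) with the gradient of the translation loss taken on the clean input. The function names, the toy embeddings, the hand-written gradients, and the way candidates and positions are supplied are all hypothetical; in a real system the gradients would come from backpropagation through the NMT model.

```python
import math

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)

def pick_substitute(word, grad, embeddings, candidates):
    """Pick the candidate whose offset from `word` best aligns with `grad`.

    `grad` stands for the gradient of the translation loss with respect to
    the embedding of `word`, computed on the clean input (assumed given).
    If there are no usable candidates, the original word is returned.
    """
    base = embeddings[word]
    best, best_score = word, float("-inf")
    for cand in candidates:
        if cand == word:
            continue
        offset = [c - b for c, b in zip(embeddings[cand], base)]
        score = cosine(offset, grad)
        if score > best_score:
            best, best_score = cand, score
    return best

def perturb(sentence, grads, embeddings, candidates_for, positions):
    """Replace the words at `positions` with gradient-guided substitutes."""
    out = list(sentence)
    for i in positions:
        out[i] = pick_substitute(
            sentence[i], grads[i], embeddings, candidates_for(sentence[i])
        )
    return out

# Toy data (hypothetical): 2-dimensional embeddings and hand-written gradients.
embeddings = {
    "good": [1.0, 0.0],
    "great": [1.0, 1.0],
    "bad": [-1.0, 0.0],
    "film": [0.0, 2.0],
}
sentence = ["good", "film"]
grads = [[0.0, 1.0], [0.0, 0.0]]  # one gradient vector per source position

def candidates_for(word):
    # Assumption: a small, fixed candidate list per word.
    return ["great", "bad"] if word == "good" else []

print(perturb(sentence, grads, embeddings, candidates_for, positions=[0]))
# ['great', 'film']
```

In this sketch the choice of which positions to perturb and which candidates to consider is left to the caller. Restricting candidates to plausible words is what keeps the perturbed sentence close in meaning to the original.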
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Nuclear Materials and Properties · Topic Modeling
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
