Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation

Weiting Tan; Shuoyang Ding; Huda Khayrallah; Philipp Koehn

arXiv:2110.05691 · cs.CL · October 13, 2021

TL;DR

This paper introduces a doubly-trained adversarial data augmentation method for neural machine translation that enhances model robustness by generating semantically consistent adversarial samples using a joint loss with two translation models.

Contribution

It proposes a novel doubly-trained architecture with a joint loss function to generate adversarial samples that improve NMT robustness across multiple language pairs.

Findings

01 The generated adversarial samples improve translation robustness.

02 The method is effective across three different language pairs and two evaluation metrics.

03 Models trained with the augmented samples are more resilient to noisy inputs.

Abstract

Neural Machine Translation (NMT) models are known to suffer from noisy inputs. To make models robust, we generate adversarial augmentation samples that attack the model and preserve the source-side semantic meaning at the same time. To generate such samples, we propose a doubly-trained architecture that pairs two NMT models of opposite translation directions with a joint loss function, which combines the target-side attack and the source-side semantic similarity constraint. The results from our experiments across three different language pairs and two evaluation metrics show that these adversarial samples improve the model robustness.
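
To make the idea of a joint loss concrete, the following is a minimal sketch of how a target-side attack term and a source-side similarity term could be combined into one score for a candidate adversarial sample. It is not the paper's formula: the function names `avg_nll` and `joint_loss`, the interpolation weight `weight`, and the use of per-token probabilities as inputs are all assumptions made for illustration. The forward (source-to-target) model's probabilities for the reference translation stand in for the attack, and the backward (target-to-source) model's probabilities for the original source stand in for the semantic constraint.

```python
import math
from typing import Sequence


def avg_nll(token_probs: Sequence[float]) -> float:
    """Average negative log-likelihood of a token sequence."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def joint_loss(
    fwd_ref_probs: Sequence[float],
    bwd_src_probs: Sequence[float],
    weight: float = 0.5,  # hypothetical interpolation weight, not from the paper
) -> float:
    """Score a candidate adversarial source sentence (lower is better).

    fwd_ref_probs: probabilities the forward model assigns to the reference
        target tokens when given the perturbed source (assumed input).
    bwd_src_probs: probabilities the backward model assigns to the original
        source tokens (assumed input).
    """
    # Attack term: a good adversarial sample makes the reference unlikely,
    # so we minimise the negated NLL of the reference.
    attack = -avg_nll(fwd_ref_probs)
    # Similarity term: the original source should stay recoverable,
    # so we minimise the NLL of the original source.
    similarity = avg_nll(bwd_src_probs)
    return weight * attack + (1.0 - weight) * similarity


# A candidate that hurts the forward model but keeps the source recoverable.
strong = joint_loss([0.1, 0.2], [0.9, 0.8])
# A candidate that leaves the forward model unaffected.
harmless = joint_loss([0.9, 0.8], [0.9, 0.8])
# A candidate that hurts the forward model but also drifts semantically.
drifted = joint_loss([0.1, 0.2], [0.1, 0.2])
```

Under these assumptions, `strong` scores lower than both `harmless` and `drifted`, which mirrors the stated goal of attacking the model while preserving the source-side meaning.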

Code & Models

Repositories

steventan0110/NMTModelAttack
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Adversarial Robustness in Machine Learning