Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation

Neel Bhandari; Pin-Yu Chen

arXiv:2307.12520·cs.CL·July 25, 2023

TL;DR

This paper investigates how adversarial text examples are affected by round-trip translation and proposes using machine translation during generation to create more translation-robust adversarial attacks.

Contribution

The paper shows that existing adversarial attacks lose effectiveness after round-trip translation and introduces a method that integrates machine translation into adversarial example generation to improve robustness.

Findings

01

Existing attacks do not withstand round-trip translation.

02

Integrating machine translation enhances adversarial robustness.

03

The results motivate further research into multilingual adversarial attacks.

Abstract

Language models today achieve high accuracy across a large number of downstream tasks. However, they remain susceptible to adversarial attacks, particularly those in which the adversarial examples maintain considerable similarity to the original text. Given the multilingual nature of text, the effectiveness of adversarial examples across translations and how machine translations can improve the robustness of adversarial examples remain largely unexplored. In this paper, we present a comprehensive study on the robustness of current text adversarial attacks to round-trip translation. We demonstrate that 6 state-of-the-art text-based adversarial attacks do not maintain their efficacy after round-trip translation. Furthermore, we introduce an intervention-based solution to this problem by integrating machine translation into the process of adversarial example generation and…
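The round-trip check described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`round_trip`, `survives_round_trip`, `filter_robust_candidates`), the dictionary-based toy translators, and the keyword classifier are all hypothetical stand-ins. In practice the translators would be real machine translation models and the classifier the model under attack. The filtering step is one plausible way to bring translation into candidate selection; the paper's actual integration may differ.

```python
from typing import Callable, Iterable, List

Translator = Callable[[str], str]
Classifier = Callable[[str], int]


def round_trip(text: str, forward: Translator, backward: Translator) -> str:
    """Translate text into a pivot language and back again."""
    return backward(forward(text))


def survives_round_trip(original: str, adversarial: str, classify: Classifier,
                        forward: Translator, backward: Translator) -> bool:
    """True if the adversarial text still flips the label after round-trip translation."""
    true_label = classify(original)
    if classify(adversarial) == true_label:
        return False  # the candidate was never adversarial
    return classify(round_trip(adversarial, forward, backward)) != true_label


def filter_robust_candidates(original: str, candidates: Iterable[str],
                             classify: Classifier, forward: Translator,
                             backward: Translator) -> List[str]:
    """Keep only candidates whose attack survives round-trip translation."""
    return [c for c in candidates
            if survives_round_trip(original, c, classify, forward, backward)]


# --- Hypothetical toy stand-ins (assumptions, not real MT models) ---
EN_TO_PIVOT = {"good": "bon", "fine": "bon", "watchable": "regardable"}
PIVOT_TO_EN = {"bon": "good", "regardable": "watchable"}


def toy_forward(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(EN_TO_PIVOT.get(w, w) for w in text.split())


def toy_backward(text: str) -> str:
    return " ".join(PIVOT_TO_EN.get(w, w) for w in text.split())


def toy_classify(text: str) -> int:
    # Toy sentiment model: positive (1) only if the word "good" appears.
    return 1 if "good" in text.split() else 0


original = "the movie was good"
candidates = ["the movie was fine", "the movie was watchable"]
robust = filter_robust_candidates(original, candidates, toy_classify,
                                  toy_forward, toy_backward)
print(robust)
```

In this toy setup, the candidate using "fine" fools the classifier directly but is normalized back to "good" by the round trip, so the attack is undone; the candidate using "watchable" survives translation and is kept.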

Code & Models

Repositories

neelbhandari6/nmt_text_attack
Official
