Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

Minhao Cheng; Jinfeng Yi; Pin-Yu Chen; Huan Zhang; Cho-Jui Hsieh

arXiv:1803.01128·cs.LG·April 22, 2020·70 cites

TL;DR

This paper introduces a novel method for generating adversarial examples for sequence-to-sequence models, demonstrating their vulnerability in tasks like translation and summarization, while highlighting their relative robustness compared to CNN classifiers.

Contribution

The paper proposes a new gradient-based attack method tailored to the discrete text inputs and almost infinite output space of seq2seq models, advancing adversarial evaluation techniques.

Findings

01

Changing fewer than 3 words can significantly alter outputs

02

Seq2seq models are more robust to adversarial attacks than CNN classifiers

03

Effective targeted and non-overlapping attacks achieved

Abstract

Crafting adversarial examples has become an important technique to evaluate the robustness of deep neural networks (DNNs). However, most existing works focus on attacking the image classification problem since its input space is continuous and output space is finite. In this paper, we study the much more challenging problem of crafting adversarial examples for sequence-to-sequence (seq2seq) models, whose inputs are discrete text strings and outputs have an almost infinite number of possibilities. To address the challenges caused by the discrete input space, we propose a projected gradient method combined with group lasso and gradient regularization. To handle the almost infinite output space, we design some novel loss functions to conduct non-overlapping attack and targeted keyword attack. We apply our algorithm to machine translation and text summarization tasks, and verify the…
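The abstract names two ingredients for handling discrete text input: a group lasso penalty and a projection step. The sketch below is a minimal illustration of those two ideas only, not the authors' implementation. It assumes the perturbation is a matrix with one row per input word, applies block soft-thresholding (the standard proximal operator of the group lasso penalty) so that most rows become exactly zero, and then projects each perturbed embedding onto the nearest word in a vocabulary. The toy vocabulary, embeddings, threshold, and function names are all hypothetical, and the gradient computation and gradient regularization are omitted.

```python
import math

def group_soft_threshold(delta, threshold):
    """Block soft-thresholding: shrink each row's L2 norm by `threshold`.

    Rows with norm <= threshold become exactly zero, which is how a
    group lasso penalty keeps most words unchanged.
    """
    out = []
    for row in delta:
        norm = math.sqrt(sum(v * v for v in row))
        if norm <= threshold:
            out.append([0.0] * len(row))
        else:
            scale = 1.0 - threshold / norm
            out.append([scale * v for v in row])
    return out

def project_to_vocab(vec, vocab):
    """Return the word whose embedding is nearest to `vec` (squared Euclidean)."""
    return min(
        vocab,
        key=lambda w: sum((a - b) ** 2 for a, b in zip(vec, vocab[w])),
    )

# Hypothetical 2-dimensional embeddings for a three-word vocabulary.
vocab = {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "car": [1.0, 0.2]}
sentence = ["cat", "dog"]

# Assumed perturbation after a gradient step: small for word 0, large for word 1.
delta = [[0.05, 0.1], [0.9, -0.8]]

shrunk = group_soft_threshold(delta, threshold=0.2)
adversarial = [
    project_to_vocab([x + d for x, d in zip(vocab[word], row)], vocab)
    for word, row in zip(sentence, shrunk)
]
print(adversarial)
```

In this toy run the small perturbation on the first word is zeroed out, so only the second word can change. That mirrors the goal of altering as few input words as possible.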

Code & Models

Repositories

cmhcbb/Seq2Sick (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence