Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data

Puyudi Yang; Jianbo Chen; Cho-Jui Hsieh; Jane-Ling Wang; Michael I. Jordan

arXiv:1805.12316·cs.LG·June 1, 2018·29 cites

Open Access

TL;DR

This paper introduces a probabilistic framework for adversarial attacks on discrete data and derives two methods from it, Greedy Attack and Gumbel Attack, which are shown to be effective against text classification models including a word-based CNN, a character-based CNN, and an LSTM.

Contribution

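It proposes a probabilistic framework for adversarial attacks on discrete data and two attack methods derived from it: the perturbation-based Greedy Attack and the scalable, learning-based Gumbel Attack. Both are evaluated with quantitative metrics and human evaluation on several text classification models.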

Findings

01

Character-based CNN accuracy drops to the level of random selection when Greedy Attack modifies only five characters

02

Methods outperform baseline attacks in effectiveness

03

Human evaluation confirms attack success

Abstract

We present a probabilistic framework for studying adversarial attacks on discrete data. Based on this framework, we derive a perturbation-based method, Greedy Attack, and a scalable learning-based method, Gumbel Attack, that illustrate various tradeoffs in the design of attacks. We demonstrate the effectiveness of these methods using both quantitative metrics and human evaluation on various state-of-the-art models for text classification, including a word-based CNN, a character-based CNN and an LSTM. As an example of our results, we show that the accuracy of character-based convolutional networks drops to the level of random selection by modifying only five characters through Greedy Attack.
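The abstract describes Greedy Attack only as a perturbation-based method and does not spell out the algorithm. As a rough illustration of what a greedy, budget-limited character substitution attack can look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: `ALPHABET`, `TOY_WEIGHTS`, `toy_prob_positive`, `hamming_distance`, and `greedy_substitution_attack` are hypothetical names, the toy scoring function stands in for a real classifier and is not one of the models evaluated in the paper, and the search loop is a generic greedy procedure rather than the authors' method.

```python
import math

# Hypothetical character vocabulary for the toy example.
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

# Hypothetical per-character weights for a stand-in "sentiment" model.
TOY_WEIGHTS = {"g": 1.0, "o": 0.5, "b": -1.0, "x": -2.0}


def toy_prob_positive(text):
    """Stand-in classifier: sigmoid of the summed character weights."""
    score = sum(TOY_WEIGHTS.get(ch, 0.0) for ch in text)
    return 1.0 / (1.0 + math.exp(-score))


def hamming_distance(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_substitution_attack(text, prob_of_original_class, budget):
    """Generic greedy attack (an illustration, not the paper's algorithm).

    At each step, try every single-character substitution at positions
    not yet modified, and keep the one that most lowers the model's
    probability for the original class. Stop after `budget` edits or
    when no substitution lowers that probability any further.
    """
    chars = list(text)
    changed = set()
    for _ in range(budget):
        best_prob = prob_of_original_class("".join(chars))
        best_edit = None
        for i in range(len(chars)):
            if i in changed:
                continue
            original_char = chars[i]
            for candidate in ALPHABET:
                if candidate == original_char:
                    continue
                chars[i] = candidate
                p = prob_of_original_class("".join(chars))
                if p < best_prob:
                    best_prob, best_edit = p, (i, candidate)
            chars[i] = original_char  # restore before trying the next position
        if best_edit is None:
            break
        position, replacement = best_edit
        chars[position] = replacement
        changed.add(position)
    return "".join(chars)


if __name__ == "__main__":
    original = "good"
    adversarial = greedy_substitution_attack(original, toy_prob_positive, budget=2)
    print(original, toy_prob_positive(original))
    print(adversarial, toy_prob_positive(adversarial))
```

Each step of this sketch queries the model once per candidate (position, character) pair, so its cost grows with the text length, the vocabulary size, and the edit budget.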

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Terrorism, Counterterrorism, and Political Violence

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory