Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers

Ji Gao; Jack Lanchantin; Mary Lou Soffa; Yanjun Qi

arXiv:1801.04354 · cs.CL · May 24, 2018 · 57 cites

TL;DR

This paper introduces DeepWordBug, a novel black-box attack algorithm that generates minimal text perturbations to deceive deep learning classifiers across various text datasets.

Contribution

The paper presents a new black-box adversarial attack method, DeepWordBug, which effectively identifies critical tokens and applies minimal character-level changes to fool classifiers.

Findings

01

DeepWordBug significantly reduces classifier accuracy.

02

The attack reduces prediction accuracy by 68% on average for a Word-LSTM model.

03

It outperforms baseline methods in black-box settings.

Abstract

Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to black-box attacks, which are more realistic scenarios. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that forces a deep-learning classifier to misclassify a text input. We employ novel scoring strategies to identify the critical tokens that, if modified, cause the classifier to make an incorrect prediction. Simple character-level transformations are applied to the highest-ranked tokens in order to minimize the edit distance of the perturbation, yet change the original classification. We evaluated DeepWordBug on eight real-world text datasets, including text classification, sentiment analysis, and spam detection. We compare the result of DeepWordBug with two…
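The score-then-perturb idea described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: `toy_classifier` is a hypothetical keyword model standing in for the black-box classifier, the scoring rule (replace one token with a placeholder and measure the change in output) is only one simple possibility, and `swap_middle` is one example of a character-level transformation with small edit distance.

```python
from typing import Callable, List

UNK = "<unk>"  # hypothetical placeholder token, assumed unknown to the model


def toy_classifier(tokens: List[str]) -> float:
    """Hypothetical black-box model: returns P(positive) from keyword hits."""
    positive = {"great", "good", "love"}
    hits = sum(1 for t in tokens if t in positive)
    return hits / (hits + 1)


def score_tokens(tokens: List[str],
                 predict: Callable[[List[str]], float]) -> List[float]:
    """Score each token by the drop in output when it is replaced by UNK.

    Only model outputs are queried, so no gradients or weights are needed.
    """
    base = predict(tokens)
    scores = []
    for i in range(len(tokens)):
        perturbed = tokens[:i] + [UNK] + tokens[i + 1:]
        scores.append(base - predict(perturbed))
    return scores


def swap_middle(token: str) -> str:
    """Swap two adjacent interior characters; short tokens are left unchanged."""
    if len(token) < 4:
        return token
    mid = len(token) // 2
    chars = list(token)
    chars[mid - 1], chars[mid] = chars[mid], chars[mid - 1]
    return "".join(chars)


def attack(tokens: List[str],
           predict: Callable[[List[str]], float],
           budget: int) -> List[str]:
    """Perturb the `budget` highest-scoring tokens with a character swap."""
    scores = score_tokens(tokens, predict)
    # Stable sort: ties keep their original left-to-right order.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    adversarial = list(tokens)
    for i in ranked[:budget]:
        adversarial[i] = swap_middle(adversarial[i])
    return adversarial


if __name__ == "__main__":
    tokens = ["i", "love", "this", "great", "movie"]
    adversarial = attack(tokens, toy_classifier, budget=2)
    print(adversarial, toy_classifier(tokens), toy_classifier(adversarial))
```

On this toy input, the two keyword tokens receive the highest scores and become "lvoe" and "gerat", which the hypothetical model no longer recognizes, so its positive score falls from about 0.67 to 0.0. A real evaluation would query an actual trained classifier in place of `toy_classifier`.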

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Topic Modeling