FastWordBug: A Fast Method To Generate Adversarial Text Against NLP Applications

Dou Goodman; Lv Zhonghou; Wang minghua

arXiv:2002.00760·cs.CL·February 4, 2020·5 cites

TL;DR

FastWordBug is an efficient black-box adversarial attack algorithm that quickly identifies and perturbs key words in text to mislead NLP models with minimal queries.

Contribution

The paper introduces FastWordBug, a black-box attack algorithm whose scoring method uses part-of-speech attributes to quickly identify the words that most affect a text classifier's prediction.

Findings

1. Significantly reduces the accuracy of the attacked NLP models.
2. Requires few queries to the target model.
3. Is also effective against two real-world cloud NLP services.

Abstract

In this paper, we present a novel algorithm, FastWordBug, to efficiently generate small text perturbations in a black-box setting that force a sentiment analysis or text classification model to make an incorrect prediction. By combining the part-of-speech attributes of words, we propose a scoring method that can quickly identify the important words that affect text classification. We evaluate FastWordBug on three real-world text datasets and two state-of-the-art machine learning models in a black-box setting. The results show that our method significantly reduces the accuracy of the model while calling the model as few times as possible, giving the highest attack efficiency. We also attack two popular real-world cloud NLP services, and the results show that our method works against them as well.
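
The general idea in the abstract (rank words by part of speech, then perturb the top-ranked ones while counting model queries) can be illustrated with a minimal Python sketch. This is not the paper's scoring formula: the weights in `POS_WEIGHT`, the hand-written tag table `TOY_TAGS`, the character-swap perturbation, and the classifier `toy_predict` are all hypothetical stand-ins chosen only to make the example self-contained.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical POS weights, assumed for illustration (not taken from the paper).
POS_WEIGHT: Dict[str, float] = {
    "ADJ": 1.0, "ADV": 0.8, "VERB": 0.6, "NOUN": 0.4, "OTHER": 0.1,
}

# Tiny hand-written tag table standing in for a real POS tagger.
TOY_TAGS: Dict[str, str] = {
    "terrible": "ADJ", "great": "ADJ", "boring": "ADJ",
    "really": "ADV", "was": "VERB", "movie": "NOUN", "plot": "NOUN",
}


def rank_words(words: List[str]) -> List[int]:
    """Return word indices by descending POS weight, using no model queries."""
    def weight(i: int) -> float:
        return POS_WEIGHT[TOY_TAGS.get(words[i].lower(), "OTHER")]
    # sorted() is stable, so equally weighted words keep their original order.
    return sorted(range(len(words)), key=lambda i: -weight(i))


def swap_chars(word: str) -> str:
    """Swap the two characters around the middle of a word of length >= 4."""
    if len(word) < 4:
        return word
    i = len(word) // 2
    return word[: i - 1] + word[i] + word[i - 1] + word[i + 1:]


def attack(
    text: str, predict: Callable[[str], str], max_edits: int = 3
) -> Tuple[Optional[str], int]:
    """Perturb top-ranked words until the label flips; return (text, queries)."""
    words = text.split()
    original_label = predict(text)
    queries = 1  # the query that obtains the original label
    for idx in rank_words(words)[:max_edits]:
        words[idx] = swap_chars(words[idx])
        candidate = " ".join(words)
        queries += 1
        if predict(candidate) != original_label:
            return candidate, queries
    return None, queries  # no label flip within the edit budget


def toy_predict(text: str) -> str:
    """Hypothetical keyword classifier used only to exercise the sketch."""
    negative = {"terrible", "boring"}
    return "neg" if any(w.lower() in negative for w in text.split()) else "pos"
```

With this toy setup, `attack("The movie was terrible", toy_predict)` returns `("The movie was terirble", 2)`: the adjective is ranked first without any model call, so one query fetches the original label and one more confirms the flip. A real black-box attack would replace `toy_predict` with calls to the target model and `TOY_TAGS` with an actual tagger.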

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Malware Detection Techniques