TextHacker: Learning based Hybrid Local Search Algorithm for Text Hard-label Adversarial Attack

Zhen Yu; Xiaosen Wang; Wanxiang Che; Kun He

arXiv:2201.08193 · cs.CL · October 25, 2022 · 1 citation

TL;DR

TextHacker is a hard-label adversarial attack that learns word importance from changes in the predicted label and uses a hybrid local search to shrink the perturbation of its adversarial examples, outperforming existing hard-label attacks.

Contribution

The paper introduces a hard-label attack framework that combines a hybrid local search with word importance estimation, providing a tool for testing the robustness of NLP models.

Findings

01 Significantly outperforms existing hard-label attacks

02 Effective in text classification and textual entailment tasks

03 Reduces adversarial perturbations while maintaining attack success

Abstract

Existing textual adversarial attacks usually rely on gradients or prediction confidence to generate adversarial examples, which makes them hard to deploy in real-world applications. To this end, we consider a rarely investigated but more rigorous setting, namely the hard-label attack, in which the attacker can only access the predicted label. In particular, we find that we can learn the importance of different words from the changes in the predicted label caused by word substitutions on the adversarial examples. Based on this observation, we propose a novel adversarial attack, termed Text Hard-label attacker (TextHacker). TextHacker first randomly perturbs many words to craft an adversarial example. It then adopts a hybrid local search algorithm, guided by word importance estimated from the attack history, to minimize the adversarial perturbation. Extensive evaluations for text…
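The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the victim model `toy_label`, the `SYNONYMS` table, and the function `hard_label_attack` are hypothetical names invented for this example, and the paper's hybrid local search is replaced here by a much simpler greedy restoration loop with a plain per-position weight table. The sketch only shows the general idea: query labels only, start from a heavily perturbed adversarial example, then restore original words while the label stays flipped, using past label changes to decide which words to try first.

```python
import random
from typing import Callable, Dict, List, Optional

# Hypothetical synonym table (assumption, not from the paper).
SYNONYMS: Dict[str, List[str]] = {
    "movie": ["film"],
    "good": ["decent", "nice"],
    "plot": ["story"],
    "great": ["superb"],
}


def toy_label(words: List[str]) -> int:
    """Hypothetical hard-label victim: returns a class id only, no scores."""
    positive = {"good", "great"}
    return 1 if sum(w in positive for w in words) > 0 else 0


def hard_label_attack(
    words: List[str],
    synonyms: Dict[str, List[str]],
    label_fn: Callable[[List[str]], int],
    rng: random.Random,
    init_tries: int = 200,
) -> Optional[List[str]]:
    true_label = label_fn(words)
    positions = [i for i, w in enumerate(words) if w in synonyms]

    # Stage 1: randomly substitute many words until the label flips.
    adv = None
    for _ in range(init_tries):
        cand = list(words)
        for i in positions:
            if rng.random() < 0.5:
                cand[i] = rng.choice(synonyms[words[i]])
        if label_fn(cand) != true_label:
            adv = cand
            break
    if adv is None:
        return None  # no adversarial starting point found

    # Stage 2: restore original words while the example stays adversarial.
    # weights[i] grows when restoring position i flips the label back,
    # i.e. the substitution at i appears important for the attack.
    weights = {i: 0.0 for i in positions}
    improved = True
    while improved:
        improved = False
        changed = [i for i in positions if adv[i] != words[i]]
        # Try the positions estimated to be least important first.
        for i in sorted(changed, key=lambda j: weights[j]):
            cand = list(adv)
            cand[i] = words[i]
            if label_fn(cand) != true_label:
                adv = cand          # still adversarial: keep the restoration
                weights[i] -= 1.0
                improved = True
            else:
                weights[i] += 1.0   # label flipped back: word matters
    return adv


sentence = "the movie was good and the plot was great".split()
adversarial = hard_label_attack(sentence, SYNONYMS, toy_label, random.Random(0))
```

On this toy input the victim only reacts to "good" and "great", so the restoration loop puts "movie" and "plot" back and leaves substitutions only at those two positions. A real attack would also need a query budget and a semantic similarity constraint, which this sketch omits.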

Code & Models

Repositories

jhl-hust/texthacker (PyTorch, official)

Taxonomy

Topics: Advanced Malware Detection Techniques · Network Security and Intrusion Detection · Spam and Phishing Detection