Rethink the Evaluation for Attack Strength of Backdoor Attacks in Natural Language Processing

Lingfeng Shen; Haiyun Jiang; Lemao Liu; Shuming Shi

arXiv:2201.02993·cs.CL·February 17, 2022·1 cites

TL;DR

This paper challenges existing evaluations of stealthy backdoor attacks in NLP, proposing a new metric (ASRD) to better measure attack strength and introducing Trigger Breaker, a simple yet effective defense method.

Contribution

It introduces ASRD as a more accurate metric for attack strength and presents Trigger Breaker, a novel defense approach against stealthy backdoor attacks in NLP.

Findings

01. ASRD provides a more accurate measure of attack strength.

02. Trigger Breaker outperforms existing defenses.

03. Stealthy backdoor attack capacity is overestimated by previous metrics.

Abstract

It has been shown that natural language processing (NLP) models are vulnerable to a security threat called the backdoor attack, which uses a 'backdoor trigger' paradigm to mislead the models. The most threatening backdoor attacks are stealthy backdoors, which define the trigger as a text style or syntactic structure. Although these attacks achieve an incredibly high attack success rate (ASR), we find that the principal factor contributing to their ASR is not the 'backdoor trigger' paradigm. The capacity of these stealthy backdoor attacks is therefore overestimated when they are categorized as backdoor attacks. To evaluate the real attack power of backdoor attacks, we propose a new metric called attack successful rate difference (ASRD), which measures the difference in ASR between clean-state and poison-state models. Besides, since defenses against stealthy backdoor attacks are absent, we…
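The ASRD idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that ASR is the fraction of triggered test inputs classified as the attacker's target label, and that ASRD is the poisoned model's ASR minus the clean model's ASR on the same triggered inputs. The function names, the sign convention, and the toy predictions are all hypothetical.

```python
from typing import Sequence


def attack_success_rate(predictions: Sequence[int], target_label: int) -> float:
    """Fraction of triggered inputs that a model assigns to the target label."""
    if len(predictions) == 0:
        raise ValueError("predictions must not be empty")
    hits = sum(1 for p in predictions if p == target_label)
    return hits / len(predictions)


def asrd(
    poisoned_preds: Sequence[int],
    clean_preds: Sequence[int],
    target_label: int,
) -> float:
    """ASR of the poisoned model minus ASR of the clean model.

    Both prediction lists are assumed to come from the same triggered
    test inputs. The sign convention (poisoned minus clean) is an assumption.
    """
    if len(poisoned_preds) != len(clean_preds):
        raise ValueError("both models must be evaluated on the same inputs")
    return attack_success_rate(poisoned_preds, target_label) - attack_success_rate(
        clean_preds, target_label
    )


# Toy, made-up predictions on ten triggered inputs (target label = 1).
poisoned_preds = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]  # 9 of 10 hit the target
clean_preds = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]     # 7 of 10 hit the target

if __name__ == "__main__":
    print(f"ASR (poisoned): {attack_success_rate(poisoned_preds, 1):.2f}")
    print(f"ASR (clean):    {attack_success_rate(clean_preds, 1):.2f}")
    print(f"ASRD:           {asrd(poisoned_preds, clean_preds, 1):.2f}")
```

In this toy case the poisoned model's ASR looks high on its own, but most of it is already reached by the clean model, so the difference attributable to poisoning is small. That is the kind of overestimation the abstract argues plain ASR can hide.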

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research