Contextualized Perturbation for Textual Adversarial Attack

Dianqi Li; Yizhe Zhang; Hao Peng; Liqun Chen; Chris Brockett; Ming-Ting Sun; Bill Dolan

arXiv:2009.07502 · cs.CL · March 16, 2021 · 20 cites

TL;DR

CLARE is a context-aware adversarial text generation model that creates fluent, grammatical examples to effectively evaluate NLP model robustness, outperforming previous methods in success rate and linguistic quality.

Contribution

This paper introduces CLARE, a context-aware approach built on a pre-trained masked language model. It generates natural adversarial examples through three perturbation types (Replace, Insert and Merge) and attacks a victim model more efficiently, with fewer edits.

Findings

1. CLARE achieves higher attack success rates than baselines.
2. Generated adversarial examples are more fluent and grammatical.
3. Fewer edits are needed for successful attacks.

Abstract

Adversarial examples expose the vulnerabilities of natural language processing (NLP) models, and can be used to evaluate and improve their robustness. Existing techniques of generating such examples are typically driven by local heuristic rules that are agnostic to the context, often resulting in unnatural and ungrammatical outputs. This paper presents CLARE, a ContextuaLized AdversaRial Example generation model that produces fluent and grammatical outputs through a mask-then-infill procedure. CLARE builds on a pre-trained masked language model and modifies the inputs in a context-aware manner. We propose three contextualized perturbations, Replace, Insert and Merge, allowing for generating outputs of varied lengths. With a richer range of available strategies, CLARE is able to attack a victim model more efficiently with fewer edits. Extensive experiments and human evaluation…
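
The masking half of the mask-then-infill procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that Replace masks one token, Insert adds a mask after a token, and Merge collapses a bigram into a single mask, which is what lets the outputs vary in length. The function names and the `[MASK]` string are hypothetical, and the infill step (a masked language model proposing a token for each mask) and the victim-model check are left out.

```python
from typing import Dict, List

# Placeholder mask string (assumption); the real one depends on the masked LM's tokenizer.
MASK = "[MASK]"


def _check_index(tokens: List[str], i: int) -> None:
    """Reject positions that fall outside the token list."""
    if not 0 <= i < len(tokens):
        raise IndexError(f"position {i} out of range for {len(tokens)} tokens")


def replace_mask(tokens: List[str], i: int) -> List[str]:
    """Replace: mask the token at position i (length unchanged)."""
    _check_index(tokens, i)
    return tokens[:i] + [MASK] + tokens[i + 1:]


def insert_mask(tokens: List[str], i: int) -> List[str]:
    """Insert: add a mask right after position i (length grows by one)."""
    _check_index(tokens, i)
    return tokens[:i + 1] + [MASK] + tokens[i + 1:]


def merge_mask(tokens: List[str], i: int) -> List[str]:
    """Merge: collapse the bigram at positions i and i+1 into one mask (length shrinks by one)."""
    _check_index(tokens, i)
    if i + 1 >= len(tokens):
        raise IndexError("merge needs a token after position i")
    return tokens[:i] + [MASK] + tokens[i + 2:]


def masked_candidates(tokens: List[str], i: int) -> Dict[str, List[str]]:
    """Collect every masked variant that is valid at position i."""
    candidates = {
        "replace": replace_mask(tokens, i),
        "insert": insert_mask(tokens, i),
    }
    if i + 1 < len(tokens):
        candidates["merge"] = merge_mask(tokens, i)
    return candidates


if __name__ == "__main__":
    sentence = "the movie was really good".split()
    for name, masked in masked_candidates(sentence, 3).items():
        print(f"{name:8s} {' '.join(masked)}")
```

In a full attack, each masked sequence would be passed to a masked language model to fill the mask in context, and the resulting candidates would be scored against the victim model; those steps depend on model choices that this page does not specify.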

Code & Models

Repositories

cookielee77/CLARE (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Malware Detection Techniques