Multi-granularity Textual Adversarial Attack with Behavior Cloning

Yangyi Chen; Jin Su; Wei Wei

arXiv:2109.04367·cs.CL·September 10, 2021

TL;DR

This paper introduces MAYA, a multi-granular textual adversarial attack model that efficiently generates high-quality adversarial samples with fewer queries by combining multi-level modification strategies and reinforcement learning-based behavior cloning.

Contribution

The paper proposes a novel multi-granularity attack framework and a reinforcement-learning approach that trains an attack agent through behavior cloning, substantially reducing the number of queries to the victim model and improving the quality of adversarial samples against black-box NLP models.

Findings

01

Achieves better attack success rates and sample fluency.

02

Substantially reduces the number of queries to the victim model in black-box settings.

03

Effective against models like BiLSTM, BERT, and RoBERTa.

Abstract

Recently, textual adversarial attack models have become increasingly popular due to their success in estimating the robustness of NLP models. However, existing works have obvious deficiencies: (1) they usually consider only a single granularity of modification strategies (e.g., word-level or sentence-level), which is insufficient to explore the holistic textual space for generation; (2) they need to query victim models hundreds of times to make a successful attack, which is highly inefficient in practice. To address these problems, we propose MAYA, a Multi-grAnularitY Attack model, to effectively generate high-quality adversarial samples with fewer queries to victim models. Furthermore, we propose a reinforcement-learning-based method to train a multi-granularity attack agent through behavior cloning with the expert knowledge from our MAYA algorithm to further reduce the…
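The idea of mixing modification granularities while counting victim queries can be illustrated with a small toy sketch. This is not the MAYA algorithm or the code from the official repository: the victim model, the synonym table, the paraphrase table, and every function name below are hypothetical stand-ins chosen for illustration, and the greedy selection rule is an assumption. The sketch only shows how word-level and sentence-level candidates can compete in one black-box search loop, and how the number of queries can be tracked.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical lookup tables standing in for real substitution/paraphrase models.
SYNONYMS = {"good": ["decent", "fine"], "great": ["notable"]}
PARAPHRASES = {
    "the movie was good and the cast was great": ["the cast and the movie were fine"],
}
POSITIVE_WORDS = {"good", "great"}


def toy_victim(text: str) -> float:
    """Hypothetical black-box victim: returns P(positive) for a sentence."""
    hits = sum(word in POSITIVE_WORDS for word in text.split())
    return 1.0 / (1.0 + math.exp(-(2 * hits - 1)))


class QueryCounter:
    """Wraps a victim model and counts how many times it is queried."""

    def __init__(self, model: Callable[[str], float]) -> None:
        self.model = model
        self.count = 0

    def __call__(self, text: str) -> float:
        self.count += 1
        return self.model(text)


def word_level(text: str) -> List[str]:
    """Word-level candidates: replace one word at a time with a listed synonym."""
    words = text.split()
    candidates = []
    for i, word in enumerate(words):
        for synonym in SYNONYMS.get(word, []):
            candidates.append(" ".join(words[:i] + [synonym] + words[i + 1:]))
    return candidates


def sentence_level(text: str) -> List[str]:
    """Sentence-level candidates: whole-sentence rewrites from the toy table."""
    return list(PARAPHRASES.get(text, []))


def attack(text: str, victim: Callable[[str], float], max_steps: int = 5) -> Tuple[str, bool]:
    """Greedy search over both granularities; stops when P(positive) < 0.5."""
    current = text
    for _ in range(max_steps):
        candidates = word_level(current) + sentence_level(current)
        if not candidates:
            break
        # One victim query per candidate; keep the one that hurts the label most.
        best_prob, best_text = min((victim(c), c) for c in candidates)
        current = best_text
        if best_prob < 0.5:
            return current, True
    return current, False


victim = QueryCounter(toy_victim)
adversarial, success = attack("the movie was good and the cast was great", victim)
print(adversarial, success, victim.count)
```

In this toy setting each candidate costs one query, so the query count grows with the size of the candidate pool at every step. The abstract's behavior-cloning agent is aimed at reducing that cost; the sketch above does not attempt to model it.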

Code & Models

Repositories

yangyi-chen/maya
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Malware Detection Techniques

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Tanh Activation · Sigmoid Activation · Attention Dropout · Long Short-Term Memory · WordPiece · Dense Connections