BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Linyang Li; Ruotian Ma; Qipeng Guo; Xiangyang Xue; Xipeng Qiu

arXiv:2004.09984 · cs.CL · October 5, 2020 · 68 cites

TL;DR

BERT-Attack is a novel adversarial attack method that uses pre-trained BERT models to generate fluent, semantically consistent adversarial texts, outperforming existing methods in success rate and efficiency.

Contribution

The paper introduces BERT-Attack, a new approach leveraging BERT for effective, high-quality adversarial text generation with low computational cost.

Findings

1. Outperforms state-of-the-art attack strategies in success rate
2. Generates fluent and semantically preserved adversarial samples
3. Achieves low computational cost suitable for large-scale use

Abstract

Adversarial attacks on discrete data (such as text) have proven significantly more challenging than attacks on continuous data (such as images), since it is difficult to generate adversarial samples with gradient-based methods. Current successful attack methods for text usually adopt heuristic replacement strategies at the character or word level, but finding the optimal solution in the massive space of possible replacement combinations while preserving semantic consistency and language fluency remains challenging. In this paper, we propose BERT-Attack, a high-quality and effective method for generating adversarial samples using pre-trained masked language models, exemplified by BERT. We turn BERT against its fine-tuned models and other deep neural models in downstream tasks so that we can successfully mislead the target models into predicting incorrectly. Our method outperforms…
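The word-level replacement idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `toy_positive_score` is a hypothetical keyword classifier standing in for a fine-tuned target model, `toy_candidates` is a hypothetical lookup table standing in for masked-language-model predictions, and the step that ranks positions by masking them is an assumption of this sketch rather than something stated in the abstract above.

```python
import math
from typing import Callable, Dict, List, Optional

# Hypothetical toy target model: a keyword lexicon squashed through a sigmoid.
TOY_WEIGHTS: Dict[str, float] = {"great": 2.0, "good": 1.0, "decent": -0.5, "awful": -2.0}

def toy_positive_score(tokens: List[str]) -> float:
    """Return a pseudo-probability that the text is positive."""
    z = sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical stand-in for the top-k substitutes a masked LM would propose.
TOY_CANDIDATES: Dict[str, List[str]] = {"great": ["good", "decent"], "movie": ["film"]}

def toy_candidates(tokens: List[str], i: int) -> List[str]:
    return TOY_CANDIDATES.get(tokens[i], [])

def greedy_word_attack(
    tokens: List[str],
    score_fn: Callable[[List[str]], float],
    candidates_fn: Callable[[List[str], int], List[str]],
    mask_token: str = "[MASK]",
) -> Optional[List[str]]:
    """Greedily replace words until the predicted label flips; None if it never does."""
    base = score_fn(tokens)
    original_label = base >= 0.5

    # Rank positions by how much masking each one changes the score.
    importance = []
    for i in range(len(tokens)):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        importance.append((abs(base - score_fn(masked)), i))
    importance.sort(reverse=True)

    current = list(tokens)
    for _, i in importance:
        best, best_score = None, score_fn(current)
        for cand in candidates_fn(current, i):
            trial = current[:i] + [cand] + current[i + 1:]
            s = score_fn(trial)
            if (s >= 0.5) != original_label:
                return trial  # prediction flipped: adversarial sample found
            # Otherwise keep the substitute that moves closest to the decision boundary.
            if abs(s - 0.5) < abs(best_score - 0.5):
                best, best_score = trial, s
        if best is not None:
            current = best
    return None

adversarial = greedy_word_attack(["a", "great", "movie"], toy_positive_score, toy_candidates)
print(adversarial)
```

In a realistic setting the two toy functions would be swapped for a real classifier and a real masked language model; the greedy loop only illustrates why candidate quality matters, since every substitute it tries comes from the candidate function.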

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Anomaly Detection Techniques and Applications

Methods: Linear Layer · Residual Connection · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dense Connections · Adam · WordPiece · Softmax