SemAttack: Natural Textual Attacks via Different Semantic Spaces

Boxin Wang; Chejian Xu; Xiangyu Liu; Yu Cheng; Bo Li

arXiv:2205.01287·cs.CL·June 14, 2022

TL;DR

SemAttack is a framework that efficiently generates natural adversarial texts by constraining perturbations to semantic spaces (typo, knowledge, and contextualized), exposing vulnerabilities in state-of-the-art language models across languages while keeping the text natural.

Contribution

This paper introduces SemAttack, a new method for creating natural adversarial texts using multiple semantic perturbation functions, improving attack success rates and efficiency.

Findings

01. State-of-the-art LMs are vulnerable to SemAttack.

02. SemAttack works across multiple languages with high success.

03. Generated adversarial texts are natural and human-inconspicuous.

Abstract

Recent studies show that pre-trained language models (LMs) are vulnerable to textual adversarial attacks. However, existing attack methods either suffer from low attack success rates or fail to search efficiently in the exponentially large perturbation space. We propose an efficient and effective framework SemAttack to generate natural adversarial text by constructing different semantic perturbation functions. In particular, SemAttack optimizes the generated perturbations constrained on generic semantic spaces, including typo space, knowledge space (e.g., WordNet), contextualized semantic space (e.g., the embedding space of BERT clusterings), or the combination of these spaces. Thus, the generated adversarial texts are more semantically close to the original inputs. Extensive experiments reveal that state-of-the-art (SOTA) large-scale LMs (e.g., DeBERTa-v2) and defense strategies (e.g.,…
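
The idea of restricting each word to candidates drawn from a semantic space can be illustrated with a minimal sketch. This is not the authors' implementation: the synonym table `TOY_SYNONYMS` is a hypothetical stand-in for a knowledge source such as WordNet, the adjacent-character swap is just one simple typo model, the helper names are invented for illustration, and the contextualized (BERT-based) space and the optimization step are left out entirely.

```python
from math import prod
from typing import Dict, List, Set

# Hypothetical stand-in for a knowledge source such as WordNet.
TOY_SYNONYMS: Dict[str, Set[str]] = {
    "good": {"fine", "great"},
    "movie": {"film"},
}


def typo_space(word: str) -> Set[str]:
    """Candidates made by swapping two adjacent characters.

    The first and last characters are kept in place (an assumed,
    simplified typo model).
    """
    candidates: Set[str] = set()
    for i in range(1, len(word) - 2):
        swapped = word[:i] + word[i + 1] + word[i] + word[i + 2:]
        if swapped != word:  # skip swaps of identical letters
            candidates.add(swapped)
    return candidates


def knowledge_space(word: str) -> Set[str]:
    """Candidates looked up in the toy synonym table."""
    return set(TOY_SYNONYMS.get(word.lower(), set()))


def combined_space(word: str) -> Set[str]:
    """Union of the two spaces, mirroring the 'combination' option."""
    return typo_space(word) | knowledge_space(word)


def search_space_size(tokens: List[str]) -> int:
    """Number of distinct perturbed sentences, counting 'keep the word'."""
    return prod(1 + len(combined_space(t)) for t in tokens)


if __name__ == "__main__":
    tokens = ["good", "movie"]
    for token in tokens:
        print(token, sorted(combined_space(token)))
    print("search space size:", search_space_size(tokens))
```

Because the size is a product over tokens, it grows multiplicatively with sentence length, which is one way to read the abstract's remark about an exponentially large perturbation space.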

Code & Models

Repositories

ai-secure/semattack
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Weight Decay · Adam · Attention Dropout · Dense Connections · Dropout · Linear Warmup With Linear Decay