SemAttack: Natural Textual Attacks via Different Semantic Spaces
Boxin Wang, Chejian Xu, Xiangyu Liu, Yu Cheng, Bo Li

TL;DR
SemAttack is a novel framework that efficiently generates natural adversarial texts by leveraging various semantic spaces, revealing vulnerabilities in large language models across languages while maintaining text naturalness.
Contribution
This paper introduces SemAttack, a method for creating natural adversarial texts by constructing multiple semantic perturbation functions, with the aim of improving both attack success rate and search efficiency over existing attack methods.
Findings
State-of-the-art LMs are vulnerable to SemAttack.
SemAttack works across multiple languages with high success.
Generated adversarial texts are natural and human-inconspicuous.
Abstract
Recent studies show that pre-trained language models (LMs) are vulnerable to textual adversarial attacks. However, existing attack methods either suffer from low attack success rates or fail to search efficiently in the exponentially large perturbation space. We propose an efficient and effective framework SemAttack to generate natural adversarial text by constructing different semantic perturbation functions. In particular, SemAttack optimizes the generated perturbations constrained on generic semantic spaces, including typo space, knowledge space (e.g., WordNet), contextualized semantic space (e.g., the embedding space of BERT clusterings), or the combination of these spaces. Thus, the generated adversarial texts are more semantically close to the original inputs. Extensive experiments reveal that state-of-the-art (SOTA) large-scale LMs (e.g., DeBERTa-v2) and defense strategies (e.g.,…
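The idea of building per-token candidate sets from several semantic spaces can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lookup tables `TYPO_SPACE` and `KNOWLEDGE_SPACE` and all function names are hypothetical stand-ins (a real knowledge space would come from a resource such as WordNet), and the sketch only enumerates candidates, whereas SemAttack optimizes over the constrained space. The contextualized semantic space is omitted because it requires a language model.

```python
# Hypothetical toy tables, assumed for illustration only.
TYPO_SPACE = {"a": ["@"], "o": ["0"], "l": ["1"], "e": ["3"]}
KNOWLEDGE_SPACE = {"good": ["great", "fine"], "movie": ["film"]}


def typo_candidates(token):
    """Return variants of `token` with one character replaced via TYPO_SPACE."""
    variants = []
    for i, ch in enumerate(token):
        for sub in TYPO_SPACE.get(ch, []):
            variants.append(token[:i] + sub + token[i + 1:])
    return variants


def knowledge_candidates(token):
    """Return substitutes for `token` from the toy knowledge table."""
    return list(KNOWLEDGE_SPACE.get(token, []))


def perturbation_set(token, functions):
    """Union the candidates of several perturbation functions.

    The original token is kept first so that "no change" stays an option.
    """
    candidates = [token]
    for fn in functions:
        for cand in fn(token):
            if cand not in candidates:
                candidates.append(cand)
    return candidates


def search_space_size(tokens, functions):
    """Count the distinct perturbed sentences the candidate sets allow."""
    size = 1
    for tok in tokens:
        size *= len(perturbation_set(tok, functions))
    return size


if __name__ == "__main__":
    spaces = [typo_candidates, knowledge_candidates]
    for tok in ["good", "movie"]:
        print(tok, perturbation_set(tok, spaces))
    print(search_space_size(["good", "movie"], spaces))
```

Because the number of perturbed sentences is the product of the per-token candidate counts, it grows exponentially with sentence length, which is why exhaustive search over the perturbation space is impractical.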
Taxonomy
Topics: Topic Modeling · Adversarial Robustness in Machine Learning
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Weight Decay · Adam · Attention Dropout · Dense Connections · Dropout · Linear Warmup With Linear Decay
