Semantic-Preserving Adversarial Text Attacks

Xinghao Yang; Weifeng Liu; James Bailey; Dacheng Tao; Wei Liu

arXiv:2108.10015 · cs.CL · March 6, 2023

TL;DR

This paper introduces BU-SPO, a novel adversarial attack method on text classifiers that uses bigram and unigram substitutions with semantic preservation techniques to efficiently induce misclassification while maintaining semantic integrity.

Contribution

The paper proposes a hybrid attack method that combines bigram and unigram substitutions with semantic preservation optimization, aiming to raise attack success while keeping adversarial text semantically close to the original.

Findings

1. Achieves higher attack success rates than existing methods.
2. Maintains high semantic similarity in adversarial examples.
3. Uses fewer word modifications to induce misclassification.

Abstract

Deep neural networks (DNNs) are known to be vulnerable to adversarial images, while their robustness in text classification is rarely studied. Several lines of text attack methods have been proposed in the literature, including character-level, word-level, and sentence-level attacks. However, it is still a challenge to minimize the number of word changes necessary to induce misclassification, while simultaneously ensuring lexical correctness, syntactic soundness, and semantic similarity. In this paper, we propose a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models. Our method has four major merits. Firstly, we propose to attack text documents not only at the unigram word level but also at the bigram level which better keeps semantics and avoids producing meaningless outputs. Secondly, we propose a hybrid…
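
The idea of attacking at both the unigram and the bigram level can be illustrated with a minimal sketch. This is not the authors' BU-SPO implementation: the synonym tables, the function names `candidate_substitutions` and `apply_substitution`, and the rule that a matching bigram takes priority over its constituent unigrams are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical toy synonym tables (assumed for illustration, not from the paper).
UNIGRAM_SYNONYMS: Dict[str, List[str]] = {
    "movie": ["film"],
    "great": ["excellent"],
}
BIGRAM_SYNONYMS: Dict[Tuple[str, str], List[str]] = {
    ("high", "school"): ["secondary school"],
}

Candidate = Tuple[int, int, str]  # (start index, span length, replacement text)


def candidate_substitutions(tokens: List[str]) -> List[Candidate]:
    """List substitution candidates, checking bigrams before unigrams.

    Assumption: when a bigram has an entry, its two words are treated as one
    unit and are not also offered as separate unigram substitutions.
    """
    candidates: List[Candidate] = []
    i = 0
    while i < len(tokens):
        bigram = tuple(tokens[i:i + 2])
        if len(bigram) == 2 and bigram in BIGRAM_SYNONYMS:
            candidates.extend((i, 2, r) for r in BIGRAM_SYNONYMS[bigram])
            i += 2  # skip both words of the matched bigram
            continue
        candidates.extend((i, 1, r) for r in UNIGRAM_SYNONYMS.get(tokens[i], []))
        i += 1
    return candidates


def apply_substitution(tokens: List[str], candidate: Candidate) -> List[str]:
    """Return a new token list with one candidate substitution applied."""
    start, length, replacement = candidate
    return tokens[:start] + replacement.split() + tokens[start + length:]


if __name__ == "__main__":
    words = "a great high school movie".split()
    for cand in candidate_substitutions(words):
        print(cand, " ".join(apply_substitution(words, cand)))
```

Treating "high school" as a single unit avoids replacing "high" or "school" in isolation, which is the kind of meaningless output the abstract says bigram-level attacks help prevent. A real attack would additionally score each candidate against the victim classifier and a semantic similarity measure; that step is omitted here.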

Taxonomy

Topics: Adversarial Robustness in Machine Learning