Effective and Imperceptible Adversarial Textual Attack via Multi-objectivization

Shengcai Liu; Ning Lu; Wenjing Hong; Chao Qian; Ke Tang

arXiv:2111.01528 · cs.CL · December 18, 2023 · 5 cites

TL;DR

This paper introduces HydraText, a multi-objective evolutionary algorithm that crafts adversarial text attacks balancing success and imperceptibility, outperforming existing methods in effectiveness and subtlety.

Contribution

It reformulates adversarial text attack as a multi-objective optimization problem and proposes HydraText, the first approach effective in both score-based and decision-based attack settings.

Findings

01. HydraText achieves high attack success rates.

02. The adversarial examples (AEs) it crafts are harder to distinguish from human-written text.

03. Adversarial training with HydraText improves model robustness.

Abstract

The field of adversarial textual attack has grown significantly over the last few years, where the commonly considered objective is to craft adversarial examples (AEs) that can successfully fool the target model. However, the imperceptibility of attacks, which is also essential for practical attackers, is often left out by previous studies. As a consequence, the crafted AEs tend to have obvious structural and semantic differences from the original human-written text, making them easily perceptible. In this work, we advocate leveraging multi-objectivization to address this issue. Specifically, we reformulate the problem of crafting AEs as a multi-objective optimization problem, where attack imperceptibility is considered as an auxiliary objective. Then, we propose a simple yet effective evolutionary algorithm, dubbed HydraText, to solve this problem. To the best of our knowledge,…
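The multi-objective framing in the abstract can be illustrated with a minimal Python sketch of Pareto dominance over two objectives. This is not HydraText itself and is not taken from the paper or its repository: the `Candidate` type, the `attack_score` field (assumed here to be something like one minus the model's confidence in the true label), and the `num_edits` field (assumed here as a crude proxy for imperceptibility) are all hypothetical names chosen for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical adversarial candidate with two objective values."""
    text: str
    attack_score: float  # assumed: higher means closer to fooling the model
    num_edits: int       # assumed: fewer edited words means less perceptible


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on one."""
    no_worse = a.attack_score >= b.attack_score and a.num_edits <= b.num_edits
    strictly_better = a.attack_score > b.attack_score or a.num_edits < b.num_edits
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Toy values, invented purely for illustration.
candidates = [
    Candidate("a", 0.9, 3),  # dominated by "b": same score, more edits
    Candidate("b", 0.9, 1),
    Candidate("c", 0.4, 0),  # weaker attack, but no edits at all
    Candidate("d", 0.3, 2),  # dominated by "b" on both objectives
]
front = pareto_front(candidates)
print([c.text for c in front])
```

Under these assumptions, an evolutionary search would retain the non-dominated set ("b" and "c" in the toy data) instead of a single best-scoring candidate, which is how an auxiliary objective can steer the search toward less perceptible edits.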

Code & Models

Repositories

colinlu50/hydratext (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Hate Speech and Cyberbullying Detection