Cross-Entropy Attacks to Language Models via Rare Event Simulation

Mingze Ni; Yongshun Gong; Wei Liu

arXiv:2501.11852·cs.CL·January 22, 2025

TL;DR

This paper presents Cross-Entropy Attacks (CEA), a black-box adversarial attack method for language models that uses cross-entropy (CE) optimization to select word replacements, improving attack success, imperceptibility, and sentence quality.

Contribution

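The paper introduces CEA, a cross-entropy-based approach to black-box textual adversarial attacks. It targets three limitations of earlier methods: limited versatility across victim models, inefficient optimization based on word saliency ranking, and loss of semantic integrity.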

Findings

01. CEA outperforms existing attacks in success rate

02. CEA maintains high semantic integrity and sentence quality

03. CEA is effective across document classification and translation tasks

Abstract

Black-box textual adversarial attacks are challenging due to the lack of model information and the discrete, non-differentiable nature of text. Existing methods often lack versatility for attacking different models, suffer from limited attacking performance due to the inefficient optimization with word saliency ranking, and frequently sacrifice semantic integrity to achieve better attack outcomes. This paper introduces a novel approach to textual adversarial attacks, which we call Cross-Entropy Attacks (CEA), that uses Cross-Entropy optimization to address the above issues. Our CEA approach defines adversarial objectives for both soft-label and hard-label settings and employs CE optimization to identify optimal replacements. Through extensive experiments on document classification and language translation problems, we demonstrate that our attack method excels in terms of attacking…
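The abstract describes using CE optimization to identify word replacements against a black-box model. The sketch below illustrates the generic cross-entropy method on that kind of problem; it is not the authors' implementation. Everything in it is an assumption made for illustration: the function `cross_entropy_search`, the toy objective `toy_objective` (a stand-in for querying a victim model), the candidate lists, and all hyperparameter values.

```python
import random

def cross_entropy_search(candidates, score, n_samples=50, elite_frac=0.2,
                         n_iters=20, smoothing=0.7, seed=0):
    """Generic cross-entropy method over per-position word choices.

    candidates[i] lists the allowed words at position i.
    score maps a list of words to a float (higher means a better attack).
    """
    rng = random.Random(seed)
    # Start from a uniform sampling distribution at every position.
    probs = [[1.0 / len(c)] * len(c) for c in candidates]
    best, best_score = None, float("-inf")
    n_elite = max(1, int(n_samples * elite_frac))
    for _ in range(n_iters):
        # Draw candidate sentences (as index vectors) from the current distribution.
        samples = [
            [rng.choices(range(len(c)), weights=p)[0]
             for c, p in zip(candidates, probs)]
            for _ in range(n_samples)
        ]
        # Query the black-box objective and rank the samples, best first.
        scored = sorted(
            ((score([c[i] for c, i in zip(candidates, s)]), s) for s in samples),
            key=lambda t: t[0], reverse=True,
        )
        if scored[0][0] > best_score:
            best_score, best = scored[0]
        elite = [s for _, s in scored[:n_elite]]
        # Shift each position's distribution toward the elite frequencies.
        for pos, p in enumerate(probs):
            for k in range(len(p)):
                freq = sum(1 for s in elite if s[pos] == k) / n_elite
                p[k] = smoothing * freq + (1 - smoothing) * p[k]
    return [c[i] for c, i in zip(candidates, best)], best_score

# Hypothetical toy setup: the "victim" counts positive words, and the
# objective rewards removing them while penalizing each changed word.
ORIGINAL = ["the", "movie", "was", "great"]
POSITIVE = {"great", "wonderful", "loved"}
CANDIDATES = [["the"], ["movie", "film"], ["was"], ["great", "fine", "decent"]]

def toy_objective(words):
    positive_hits = sum(w in POSITIVE for w in words)
    changed = sum(w != o for w, o in zip(words, ORIGINAL))
    return -positive_hits - 0.1 * changed

adversarial, adversarial_score = cross_entropy_search(CANDIDATES, toy_objective)
```

The change penalty in `toy_objective` is one simple way to encode a preference for few, unobtrusive edits. A real attack would replace it with queries to the victim model plus whatever semantic-similarity constraint the method prescribes.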

Code & Models

Repositories

mingzelucasni/ce-attack (PyTorch, official)
