PivotAttack: Rethinking the Search Trajectory in Hard-Label Text Attacks via Pivot Words

Yuzhi Liang, Shiliang Xiao, Jingsong Wei, Qiliang Lin, and Xia Li

arXiv:2603.10842·cs.CL·March 12, 2026

TL;DR

PivotAttack introduces an efficient inside-out method using multi-armed bandits to identify key token groups, significantly improving hard-label text attack success rates and reducing query costs.

Contribution

The paper presents a query-efficient framework that uses pivot sets and multi-armed bandits to make hard-label text attacks more effective.

Findings

01. Outperforms state-of-the-art baselines in success rate
02. Reduces query costs significantly
03. Effective on both traditional and large language models

Abstract

Existing hard-label text attacks often rely on inefficient "outside-in" strategies that traverse vast search spaces. We propose PivotAttack, a query-efficient "inside-out" framework. It employs a Multi-Armed Bandit algorithm to identify Pivot Sets (combinatorial token groups that act as prediction anchors) and strategically perturbs them to induce label flips. This approach captures inter-word dependencies and minimizes query costs. Extensive experiments across traditional models and Large Language Models demonstrate that PivotAttack consistently outperforms state-of-the-art baselines in both Attack Success Rate and query efficiency.
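
To make the "inside-out" idea concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the bandit variant, the reward, or the perturbation operator, so everything here is an assumption: a standard UCB1 bandit, a reward of 1 when the hard label survives random deletion of all tokens outside a candidate group (so the group "anchors" the prediction), and a hypothetical substitution table for the final perturbation. The names toy_classifier, find_pivot_set, perturb_pivots, and SUBSTITUTES are illustrative only.

```python
import math
import random
from itertools import combinations


def toy_classifier(tokens):
    """Hypothetical hard-label victim: returns a label only, never scores."""
    return 1 if ("not" in tokens and "good" in tokens) else 0


def find_pivot_set(tokens, classify, group_size=2, budget=1000, seed=0):
    """UCB1 over candidate token groups (assumed reward: label is preserved
    when the group is kept and every other token is dropped with prob. 0.5)."""
    rng = random.Random(seed)
    original_label = classify(tokens)
    arms = list(combinations(range(len(tokens)), group_size))
    pulls = [0] * len(arms)
    wins = [0.0] * len(arms)

    def pull(a):
        keep = set(arms[a])
        perturbed = [t for i, t in enumerate(tokens)
                     if i in keep or rng.random() < 0.5]
        # One hard-label query per pull.
        wins[a] += 1.0 if classify(perturbed) == original_label else 0.0
        pulls[a] += 1

    # Play every arm once so each has a defined empirical mean.
    for a in range(len(arms)):
        pull(a)

    # Spend the remaining budget on the arm with the highest UCB1 index.
    for t in range(len(arms), budget):
        def ucb(a):
            return wins[a] / pulls[a] + math.sqrt(2.0 * math.log(t) / pulls[a])
        pull(max(range(len(arms)), key=ucb))

    # Report the most-played arm as the pivot set.
    best = max(range(len(arms)), key=lambda a: pulls[a])
    return list(arms[best])


def perturb_pivots(tokens, pivot_indices, substitutes, classify):
    """Try substitutions on pivot tokens only; return the first input whose
    hard label differs from the original, or None if none flips."""
    original_label = classify(tokens)
    for i in pivot_indices:
        for candidate_word in substitutes.get(tokens[i], []):
            candidate = list(tokens)
            candidate[i] = candidate_word
            if classify(candidate) != original_label:
                return candidate
    return None


TOKENS = "the movie is not good at all".split()
SUBSTITUTES = {"not": ["never"], "good": ["great"]}  # hypothetical table

pivot = find_pivot_set(TOKENS, toy_classifier)
adversarial = perturb_pivots(TOKENS, pivot, SUBSTITUTES, toy_classifier)
print([TOKENS[i] for i in pivot], adversarial)
```

The sketch scores groups of tokens jointly rather than one word at a time, which is one simple way to reflect the inter-word dependencies the abstract mentions. Restricting substitutions to the identified group is what keeps the second stage cheap in queries under these assumptions.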

Taxonomy

Topics: Spam and Phishing Detection · Adversarial Robustness in Machine Learning · Topic Modeling