TF-Attack: Transferable and Fast Adversarial Attacks on Large Language Models
Zelin Li, Kehai Chen, Lemao Liu, Xuefeng Bai, Mingming Yang, Yang Xiang, Min Zhang

TL;DR
TF-Attack introduces a novel method for adversarial attacks on large language models that significantly improves transferability and speed by using an external overseer and parallel attack strategies.
Contribution
The paper proposes TF-Attack, a new scheme that enhances transferability and efficiency of adversarial attacks on LLMs by leveraging an external overseer and parallel processing.
Findings
Outperforms previous methods in transferability across models.
Achieves up to 20 times faster attack speeds.
Consistently surpasses prior approaches on multiple benchmarks.
Abstract
With the great advancements in large language models (LLMs), adversarial attacks against LLMs have recently attracted increasing attention. We found that pre-existing adversarial attack methodologies exhibit limited transferability and are notably inefficient, particularly when applied to LLMs. In this paper, we analyze the core mechanisms of previous predominant adversarial attack methods, revealing that 1) the distributions of importance scores differ markedly among victim models, restricting transferability; 2) the sequential attack process induces substantial time overhead. Based on these two insights, we introduce a new scheme, named TF-Attack, for Transferable and Fast adversarial attacks on LLMs. TF-Attack employs an external LLM as a third-party overseer rather than the victim model to identify critical units within sentences. Moreover, TF-Attack introduces the concept…
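The first observation in the abstract, that importance scores depend on the victim model, can be illustrated with a small sketch. The abstract does not say how importance scores are computed, so the code below assumes a leave-one-out scheme (a token's importance is the drop in the victim's score when that token is deleted), which is common in word-level attacks. The two "victims" are hypothetical keyword scorers with made-up weights, not real models, and none of the names come from the paper.

```python
from typing import Callable, Dict, List

# A "victim" here is any function mapping a token list to a score.
Scorer = Callable[[List[str]], float]


def make_toy_victim(weights: Dict[str, float]) -> Scorer:
    """Hypothetical stand-in for a victim model: sums per-token weights."""
    def score(tokens: List[str]) -> float:
        return sum(weights.get(t, 0.0) for t in tokens)
    return score


def importance_scores(tokens: List[str], victim: Scorer) -> List[float]:
    """Assumed leave-one-out importance: score drop when token i is deleted."""
    base = victim(tokens)
    return [base - victim(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))]


def ranking(tokens: List[str], victim: Scorer) -> List[int]:
    """Token indices ordered from most to least important for this victim."""
    scores = importance_scores(tokens, victim)
    return sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)


sentence = "the movie was surprisingly great".split()

# Two toy victims with different (invented) sensitivities.
victim_a = make_toy_victim({"great": 2.0, "surprisingly": 0.5})
victim_b = make_toy_victim({"surprisingly": 1.5, "great": 1.0, "movie": 0.25})

rank_a = ranking(sentence, victim_a)
rank_b = ranking(sentence, victim_b)

print("victim A attacks first:", sentence[rank_a[0]])
print("victim B attacks first:", sentence[rank_b[0]])
```

In this toy setting the two victims disagree on which token matters most, so an attack ordered by one victim's scores need not suit the other. Under the leave-one-out assumption, scoring a sentence of n tokens also costs n + 1 victim queries, which hints at why a sequential, victim-driven pipeline can be slow.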
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Natural Language Processing Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
