TASO: Jailbreak LLMs via Alternative Template and Suffix Optimization
Yanting Wang, Runpeng Geng, Jinghui Chen, Minhao Cheng, Jinyuan Jia

TL;DR
TASO introduces a novel method for jailbreaking large language models by alternately optimizing templates and suffixes, leveraging their complementary effects to improve attack success across multiple models and benchmarks.
Contribution
The paper proposes TASO, a new jailbreak technique that combines template and suffix optimization to enhance attack effectiveness on various LLMs.
Findings
TASO significantly improves jailbreak success rates.
Effective across 24 leading LLMs on benchmark datasets.
Outperforms existing jailbreak methods.
Abstract
Many recent studies have shown that LLMs are vulnerable to jailbreak attacks, where an attacker can perturb the input of an LLM to induce it to generate an output for a harmful question. In general, existing jailbreak techniques either optimize a semantic template intended to induce the LLM to produce harmful outputs or optimize a suffix that leads the LLM to initiate its response with specific tokens (e.g., "Sure"). In this work, we introduce TASO (Template and Suffix Optimization), a novel jailbreak method that optimizes both a template and a suffix in an alternating manner. Our insight is that suffix optimization and template optimization are complementary: suffix optimization can effectively control the first few output tokens but cannot control the overall quality of the output, while template optimization provides guidance for the entire output but cannot effectively…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Network Security and Intrusion Detection
