Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack

Mark Russinovich; Ahmed Salem; Ronen Eldan

arXiv:2404.01833 · cs.CR · February 27, 2025 · 2 cites

Open Access

TL;DR

This paper introduces Crescendo, a multi-turn jailbreak attack that gradually escalates a seemingly benign conversation to bypass LLM alignment. It reports high success rates across various models and tasks, and presents Crescendomation, a tool that automates the attack.

Contribution

The paper presents Crescendo, a novel multi-turn jailbreak method that effectively bypasses model alignment, along with Crescendomation, a tool that automates the attack and outperforms other jailbreak tools on AdvBench.

Findings

01. Crescendo achieves high success rates across multiple LLMs.

02. Crescendomation outperforms other jailbreak tools on AdvBench.

03. Crescendo can jailbreak multimodal models.

Abstract

Large Language Models (LLMs) have risen significantly in popularity and are increasingly being adopted across multiple applications. These LLMs are heavily aligned to resist engaging in illegal or unethical topics as a means to avoid contributing to responsible AI harms. However, a recent line of attacks, known as jailbreaks, seeks to overcome this alignment. Intuitively, jailbreak attacks aim to narrow the gap between what the model can do and what it is willing to do. In this paper, we introduce a novel jailbreak attack called Crescendo. Unlike existing jailbreak methods, Crescendo is a simple multi-turn jailbreak that interacts with the model in a seemingly benign manner. It begins with a general prompt or question about the task at hand and then gradually escalates the dialogue by referencing the model's replies, progressively leading to a successful jailbreak. We evaluate Crescendo…

Taxonomy

Topics: Legal Systems and Judicial Processes · Criminal Law and Evidence · Law, AI, and Intellectual Property

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Dropout · Dense Connections