Dagger Behind Smile: Fool LLMs with a Happy Ending Story

Xurui Song; Zhixin Xie; Shuo Huai; Jiayi Kong; Jun Luo

arXiv:2501.13115 · cs.CL · October 1, 2025

TL;DR

This paper introduces the Happy Ending Attack (HEA), a novel prompt-based jailbreak method that exploits LLMs' responsiveness to positive prompts, achieving high success rates with minimal interactions.

Contribution

The paper proposes HEA, a new efficient and effective jailbreak technique using positive prompts, demonstrating its success across multiple state-of-the-art LLMs.

Findings

01 HEA achieves an average attack success rate of 88.79%.

02 HEA needs at most two turns to succeed.

03 HEA is effective against models such as GPT-4o, Llama3-70b, and Gemini-pro.

Abstract

The wide adoption of Large Language Models (LLMs) has attracted significant attention from jailbreak attacks, where adversarial prompts crafted through optimization or manual design exploit LLMs to generate malicious content. However, optimization-based attacks have limited efficiency and transferability, while existing manual designs are either easily detectable or demand intricate interactions with LLMs. In this paper, we first point out a novel perspective for jailbreak attacks: LLMs are more responsive to positive prompts. Based on this, we deploy Happy Ending Attack (HEA) to wrap up a malicious request in a scenario template involving a positive prompt formed mainly via a happy ending; it thus fools LLMs into jailbreaking either immediately or at a follow-up malicious request. This has made HEA both efficient and effective, as it requires only up…

Taxonomy

Topics: Artificial Intelligence in Games

Methods: Softmax · Attention Is All You Need