LIAR: Leveraging Inference Time Alignment (Best-of-N) to Jailbreak LLMs in Seconds
James Beetham, Souradip Chakraborty, Mengdi Wang, Furong Huang, Amrit Singh Bedi, Mubarak Shah

TL;DR
This paper introduces LIAR, a fast black-box attack method that exploits inference-time misalignment to efficiently jailbreak safety-aligned LLMs, significantly reducing attack time and complexity while maintaining high success rates.
Contribution
The paper presents LIAR, a novel inference-time sampling attack that is faster and more practical than existing methods, along with a new metric for measuring safety alignment strength.
Findings
LIAR matches state-of-the-art success rates in jailbreak attacks.
Reduces attack perplexity by 10× and Time-to-Attack from hours to seconds.
Provides a theoretical framework for quantifying safety alignment robustness.
Abstract
Jailbreak attacks expose vulnerabilities in safety-aligned LLMs by eliciting harmful outputs through carefully crafted prompts. Existing methods rely on discrete optimization or trained adversarial generators, but are slow, compute-intensive, and often impractical. We argue that these inefficiencies stem from a mischaracterization of the problem. Instead, we frame jailbreaks as inference-time misalignment and introduce LIAR (Leveraging Inference-time misAlignment to jailbReak), a fast, black-box, best-of-N sampling attack requiring no training. LIAR matches state-of-the-art success rates while reducing perplexity by 10× and Time-to-Attack from hours to seconds. We also introduce a theoretical "safety net against jailbreaks" metric to quantify safety alignment strength and derive suboptimality bounds. Our work offers a simple yet effective tool for evaluating LLM robustness and…
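The abstract describes LIAR as a best-of-N sampling method but does not show the selection step. The following is a minimal, generic sketch of best-of-N selection only, not the authors' implementation: draw N candidates from a proposal function, score each one, and keep the highest-scoring candidate. The names `best_of_n`, `toy_propose`, and `toy_score` are hypothetical, and the proposal and scoring functions are neutral toy stand-ins (random word strings scored by length) chosen so the example runs without any model.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    propose: Callable[[random.Random], str],
    score: Callable[[str], float],
    n: int,
    seed: int = 0,
) -> Tuple[str, float]:
    """Draw n candidates from `propose` and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    rng = random.Random(seed)  # seeded so runs are reproducible
    candidates: List[str] = [propose(rng) for _ in range(n)]
    # Score every candidate once, then keep the (score, candidate) pair
    # with the largest score.
    scored = [(score(c), c) for c in candidates]
    best_score, best_candidate = max(scored, key=lambda pair: pair[0])
    return best_candidate, best_score


# Toy stand-ins (assumptions for illustration only): the proposal returns
# three random words, and the score is simply the string length.
WORDS = ["alpha", "beta", "gamma", "delta", "epsilon"]


def toy_propose(rng: random.Random) -> str:
    return " ".join(rng.choice(WORDS) for _ in range(3))


def toy_score(text: str) -> float:
    return float(len(text))


if __name__ == "__main__":
    for n in (1, 4, 16):
        candidate, value = best_of_n(toy_propose, toy_score, n=n, seed=0)
        print(f"n={n:2d}  score={value:.0f}  candidate={candidate!r}")
```

Because the sketch reuses one seeded generator, the candidates drawn for a smaller N are a prefix of those drawn for a larger N, so the selected score never decreases as N grows. Each sample is independent of the others, so the cost of selection scales linearly with N and needs no training step, which is consistent with the training-free framing in the abstract.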
Taxonomy
Topics: Artificial Intelligence in Law
