New Wide-Net-Casting Jailbreak Attacks Risk Large Models

Qiuchi Xiang; Haoxuan Qu; Hossein Rahmani; Jun Liu

arXiv:2605.17128·cs.CR·May 19, 2026

TL;DR

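This paper introduces the wide-net-casting jailbreak scenario, in which an adversary queries a group of large models rather than a single one to elicit harmful outputs. It shows that this scenario poses substantial safety risks and presents a tailored attack method whose success rate reaches 100% in some experiments against models without additional safeguards.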

Contribution

It identifies a new high-risk jailbreak scenario involving multiple models and develops a tailored attack method demonstrating its severity.

Findings

01

Jailbreak success rate reaches 100% in some experiments against models without additional safeguards

02

Wide-net-casting poses substantial safety risks

03

New tailored jailbreak method effectively exploits this scenario

Abstract

Jailbreak attacks on large models have drawn growing attention due to their close ties to societal safety. This work identifies a practical yet unexplored jailbreak scenario, the wide-net-casting scenario, where an adversary can query a group of large models instead of a single one to elicit harmful outputs. Our analysis reveals substantial yet previously overlooked safety risks under this scenario. As a key part of our analysis, we further develop a novel jailbreak method tailored to the wide-net-casting scenario. With this tailored method, the jailbreak success rate can even reach 100% in some experiments when targeting the large models without additional safeguards, exposing wide-net-casting as a distinct, high-risk scenario that warrants attention in future evaluation and defense research.
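
To give a rough intuition for why a group of models may carry more aggregate risk than any single model, the sketch below computes the chance that at least one model in a group yields a harmful output. It is not the paper's method or metric. It assumes outcomes are independent across models, and the function name `group_success_rate` and the per-model rates are hypothetical values chosen only for illustration.

```python
from math import prod


def group_success_rate(per_model_rates):
    """Chance that at least one model in the group is jailbroken.

    Assumption (illustrative, not from the paper): outcomes are
    independent across models, so the group rate is
    1 - product of (1 - p_i) over the per-model rates p_i.
    """
    if any(not 0.0 <= p <= 1.0 for p in per_model_rates):
        raise ValueError("each rate must lie in [0, 1]")
    return 1.0 - prod(1.0 - p for p in per_model_rates)


# Hypothetical per-model success rates, not measured results.
rates = [0.2, 0.3, 0.5]
print(f"best single model: {max(rates):.2f}")
print(f"group of {len(rates)} models: {group_success_rate(rates):.2f}")
```

Under these assumptions the group-level rate is never lower than the best single-model rate, which suggests why evaluating each model in isolation could understate the risk in this scenario.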
