BreakFun: Jailbreaking LLMs via Schema Exploitation

Amirkia Rafiei Oskooei; Mehmet S. Aktas

arXiv:2510.17904 · cs.CR · December 16, 2025

Open Access

TL;DR

This paper introduces BreakFun, a schema-based jailbreak method that exploits LLMs' adherence to structured data. It reports an average attack success rate of 89% across 13 models on JailbreakBench and proposes a guardrail defense to mitigate such vulnerabilities.

Contribution

We present a novel schema exploitation attack on LLMs and a defense mechanism using adversarial prompt deconstruction to enhance model robustness.

Findings

01. BreakFun achieves an attack success rate of up to 100% on several models.

02. The Trojan Schema is identified as the primary causal factor in the attack's success.

03. The proposed guardrail effectively detects and mitigates schema-based attacks.

Abstract

The proficiency of Large Language Models (LLMs) in processing structured data and adhering to syntactic rules is a capability that drives their widespread adoption but also makes them paradoxically vulnerable. In this paper, we investigate this vulnerability through BreakFun, a jailbreak methodology that weaponizes an LLM's adherence to structured schemas. BreakFun employs a three-part prompt that combines an innocent framing and a Chain-of-Thought distraction with a core "Trojan Schema"--a carefully crafted data structure that compels the model to generate harmful content, exploiting the LLM's strong tendency to follow structures and schemas. We demonstrate this vulnerability is highly transferable, achieving an average success rate of 89% across 13 foundational and proprietary models on JailbreakBench, and reaching a 100% Attack Success Rate (ASR) on several prominent models. A…
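
The Attack Success Rate (ASR) quoted in the abstract is conventionally the fraction of benchmark prompts for which a judge labels the model's response a successful jailbreak. The following is a minimal sketch of that arithmetic, not the authors' evaluation code. It assumes per-prompt boolean judge verdicts and an unweighted mean across models; the model names and verdicts are hypothetical placeholders, not data from the paper.

```python
from statistics import mean


def attack_success_rate(verdicts):
    """Return the fraction of prompts judged as successful jailbreaks (0.0 to 1.0)."""
    if not verdicts:
        raise ValueError("verdicts must be non-empty")
    # True counts as 1 and False as 0, so the sum is the number of successes.
    return sum(verdicts) / len(verdicts)


# Hypothetical judge verdicts per model (True = judged a successful jailbreak).
verdicts_by_model = {
    "model_a": [True, True, True, True],
    "model_b": [True, False, True, True],
}

# Per-model ASR, then an unweighted (macro) average across models.
asr_by_model = {
    name: attack_success_rate(verdicts)
    for name, verdicts in verdicts_by_model.items()
}
average_asr = mean(list(asr_by_model.values()))

for name, asr in asr_by_model.items():
    print(f"{name}: {asr:.0%}")
print(f"average: {average_asr:.1%}")
```

A macro average weights every model equally regardless of how many prompts it was tested on; whether the reported 89% was computed this way is an assumption here.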

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Topic Modeling · Advanced Malware Detection Techniques