SQL Injection Jailbreak: A Structural Disaster of Large Language Models

Jiawei Zhao; Kejiang Chen; Weiming Zhang; Nenghai Yu

arXiv:2411.01565·cs.CR·May 22, 2025

SQL Injection Jailbreak: A Structural Disaster of Large Language Models

Jiawei Zhao, Kejiang Chen, Weiming Zhang, Nenghai Yu

PDF

Open Access 1 Repo 1 Datasets

TL;DR

This paper introduces SQL Injection Jailbreak (SIJ), a novel method exploiting prompt construction vulnerabilities in large language models to induce harmful outputs, revealing a new security weakness and proposing an effective defense.

Contribution

The paper presents SIJ, a new prompt-based jailbreak technique for LLMs, and proposes a simple adaptive defense method called Self-Reminder-Key.

Findings

01

Near 100% success rate on open-source models

02

Over 85% success rate on closed-source models

03

Effective defense demonstrated with Self-Reminder-Key

Abstract

Large Language Models (LLMs) are susceptible to jailbreak attacks that can induce them to generate harmful content. Previous jailbreak methods primarily exploited the internal properties or capabilities of LLMs, such as optimization-based jailbreak methods and methods that leveraged the model's context-learning abilities. In this paper, we introduce a novel jailbreak method, SQL Injection Jailbreak (SIJ), which targets the external properties of LLMs, specifically, the way LLMs construct input prompts. By injecting jailbreak information into user prompts, SIJ successfully induces the model to output harmful content. For open-source models, SIJ achieves near 100% attack success rates on five well-known LLMs on the AdvBench and HEx-PHI, while incurring lower time costs compared to previous methods. For closed-source models, SIJ achieves an average attack success rate over 85% across five…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

weiyezhimeng/sql-injection-jailbreak
pytorchOfficial

Datasets

weiyezhimeng/SQL_Jailbreak_result
dataset· 7 dl
7 dl

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsDigital and Cyber Forensics