Adversarial SQL Injection Generation with LLM-Based Architectures

Ali Karakoc; H. Birkan Yilmaz

arXiv:2605.11188·cs.CR·May 13, 2026

TL;DR

This paper evaluates the use of Large Language Models (LLMs) for generating adversarial SQL injection payloads to test web application firewalls, introducing two novel LLM-based systems and comparing their effectiveness.

Contribution

Introduces two new LLM-based systems, RADAGAS and RefleXQLi, for adversarial SQL injection generation and provides a comprehensive evaluation against various WAFs.

Findings

01. RADAGAS-GPT4o achieves a 22.73% bypass rate.

02. The generated payloads are highly successful against AI/ML-based WAFs and less so against rule-based WAFs.

03. Less diverse payloads tend to bypass WAFs more effectively.
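To make the bypass-rate figures above concrete, here is a minimal Python sketch of how such a metric could be computed. It assumes the rate is the percentage of generated payloads that a WAF fails to block; the summary does not state the paper's exact definition. The names `bypass_rate`, `toy_waf`, and `SAMPLE_PAYLOADS` are hypothetical, and the keyword check is only a toy stand-in for a real WAF, not any system evaluated in the paper.

```python
from typing import Callable, Iterable


def bypass_rate(payloads: Iterable[str], is_blocked: Callable[[str], bool]) -> float:
    """Return the percentage of payloads that the WAF did NOT block.

    Assumed definition: 100 * (unblocked payloads) / (total payloads).
    """
    items = list(payloads)
    if not items:
        return 0.0  # avoid division by zero on an empty test set
    bypassed = sum(1 for p in items if not is_blocked(p))
    return 100.0 * bypassed / len(items)


def toy_waf(payload: str) -> bool:
    """Hypothetical stand-in for a WAF: blocks anything containing 'union'."""
    return "union" in payload.lower()


# Illustrative placeholder strings, not payloads from the paper.
SAMPLE_PAYLOADS = [
    "1 UNION SELECT 1",
    "1 union select 2",
    "1 UnIoN SELECT 3",
    "1 OR 1=1",
]

if __name__ == "__main__":
    print(f"{bypass_rate(SAMPLE_PAYLOADS, toy_waf):.2f}%")
```

In a real evaluation, `is_blocked` would wrap a request to the WAF under test and interpret its response, and the rate would be reported separately for each WAF.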

Abstract

SQL injection (SQLi) attacks remain among the serious threats ranked in the Open Worldwide Application Security Project (OWASP) Top 10. Advances in Artificial Intelligence (AI), especially in Large Language Models (LLMs), have created an opportunity to automate adversarial attack tests that measure defense mechanisms. In this paper, we present a comprehensive evaluation of use cases that utilize LLMs for adversarial SQL injection generation. We introduce two novel LLM-based systems, Retrieval Augmented Generation for Adversarial SQLi (RADAGAS) and Reflective Chain-of-Thought SQLi (RefleXQLi), and compare them with existing baselines against 10 Web Application Firewalls (WAFs) and one execution-based MySQL validator. To perform a comprehensive test, we used six rule-based open-source WAFs (ModSecurity PL1–3, Coraza PL1–3), two AI/ML-based WAFs (WAF…
