Evaluating LLM-Generated Obfuscated XSS Payloads for Machine Learning-Based Detection

Divyesh Gabbireddy; Suman Saha

arXiv:2604.19526 · cs.CR · April 22, 2026

TL;DR

This paper develops a pipeline using large language models to generate and evaluate obfuscated XSS payloads based on runtime behavior, revealing current limitations in behavior preservation and detection improvement.

Contribution

It introduces a structured approach combining deterministic transformations and LLMs with runtime evaluation to generate behavior-preserving obfuscated XSS payloads.

Findings

01

Baseline LLMs achieve a behavior match rate of 0.15

02

Fine-tuning improves the behavior match rate to 0.22

03

Adding generated payloads does not enhance detection performance
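
A behavior match rate of this kind can be read as the fraction of generated payloads whose observed runtime behavior agrees with that of the original payload. The sketch below is only an illustration of that idea, not the paper's implementation: the `BehaviorTrace` type, the event names, and the exact-set-equality matching rule are all assumptions introduced here, since the summary does not state how behaviors are recorded or compared.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class BehaviorTrace:
    """Hypothetical summary of what a payload did in a sandboxed browser.

    `events` holds labels for observed side effects; the label vocabulary
    is an assumption made for this sketch.
    """
    events: FrozenSet[str]


def behaviors_match(original: BehaviorTrace, candidate: BehaviorTrace) -> bool:
    # Assumed rule: the original must have produced at least one event,
    # and the candidate must reproduce exactly the same set of events.
    return bool(original.events) and original.events == candidate.events


def behavior_match_rate(pairs: List[Tuple[BehaviorTrace, BehaviorTrace]]) -> float:
    """Fraction of (original, candidate) pairs whose behaviors match."""
    if not pairs:
        return 0.0
    matched = sum(1 for original, candidate in pairs
                  if behaviors_match(original, candidate))
    return matched / len(pairs)
```

Under this definition, a hypothetical run in which 3 of 20 candidates reproduce the original behavior would score 0.15. Exact set equality is a strict criterion; a looser rule (for example, requiring only that the candidate's events include the original's) would yield higher rates, so the choice of matching rule matters when comparing numbers.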

Abstract

Cross-site scripting (XSS) remains a persistent web security vulnerability, especially because obfuscation can change the surface form of a malicious payload while preserving its behavior. These transformations make it difficult for traditional and machine learning-based detection systems to reliably identify attacks. Existing approaches for generating obfuscated payloads often emphasize syntactic diversity, but they do not always ensure that the generated samples remain behaviorally valid. This paper presents a structured pipeline for generating and evaluating obfuscated XSS payloads using large language models (LLMs). The pipeline combines deterministic transformation techniques with LLM-based generation and uses a browser-based runtime evaluation procedure to compare payload behavior in a controlled execution environment. This allows generated samples to be assessed through…
