A Reward-driven Automated Webshell Malicious-code Generator for Red-teaming

Yizhong Ding

arXiv:2505.24252·cs.CR·June 2, 2025

Open Access

TL;DR

This paper introduces RAWG, a reward-driven, automated webshell malicious-code generator that uses large language models and reinforcement learning to produce diverse, obfuscated payloads for red-teaming, addressing dataset scarcity and redundancy issues.

Contribution

RAWG is a novel framework combining supervised fine-tuning and reinforcement learning to generate highly diverse and obfuscated webshell malicious code, improving over existing methods.

Findings

1. RAWG achieves higher payload diversity than state-of-the-art methods.

2. RAWG produces more effective obfuscated payloads that evade detection.

3. Extensive experiments validate RAWG's superior performance in red-teaming scenarios.

Abstract

Frequent cyber-attacks have elevated webshell exploitation and defense to a critical research focus within network security. However, there remains a significant shortage of publicly available, well-categorized malicious-code datasets organized by obfuscation method. Existing malicious-code generation methods, which primarily rely on prompt engineering, often suffer from limited diversity and high redundancy in the payloads they produce. To address these limitations, we propose RAWG, a Reward-driven Automated Webshell Malicious-code Generator designed for red-teaming applications. Our approach begins by categorizing webshell samples from common datasets into seven distinct types of obfuscation. We then employ a large language model (LLM) to extract and normalize key tokens from each sample, creating a standardized, high-quality corpus. Using…

Taxonomy

Topics: Spam and Phishing Detection · Advanced Malware Detection Techniques · Caching and Content Delivery
