RedCoder: Automated Multi-Turn Red Teaming for Code LLMs

Wenjie Jacky Mo; Qin Liu; Xiaofei Wen; Dongwon Jung; Hadi Askari; Wenxuan Zhou; Zhe Zhao; Muhao Chen

arXiv:2507.22063 · cs.SE · July 31, 2025

TL;DR

RedCoder is an automated multi-turn red-teaming framework that engages code generation models in conversation to elicit vulnerable code. It uses a multi-agent game to develop attack strategies, then fine-tunes an LLM to conduct dynamic adversarial conversations.

Contribution

This paper introduces RedCoder, a novel multi-turn red-teaming approach that automates vulnerability detection in Code LLMs through interactive conversations and strategy reuse.

Findings

1. RedCoder outperforms prior methods in inducing vulnerabilities.
2. It effectively automates multi-turn interactions for security testing.
3. The approach is scalable and adaptable across different Code LLMs.

Abstract

Large Language Models (LLMs) for code generation (i.e., Code LLMs) have demonstrated impressive capabilities in AI-assisted software development and testing. However, recent studies have shown that these models are prone to generating vulnerable or even malicious code under adversarial settings. Existing red-teaming approaches rely on extensive human effort, limiting their scalability and practicality, and generally overlook the interactive nature of real-world AI-assisted programming, which often unfolds over multiple turns. To bridge these gaps, we present RedCoder, a red-teaming agent that engages victim models in multi-turn conversation to elicit vulnerable code. The pipeline to construct RedCoder begins with a multi-agent gaming process that simulates adversarial interactions, yielding a set of prototype conversations and an arsenal of reusable attack strategies. We then fine-tune…
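The multi-turn interaction the abstract describes can be pictured as a simple control loop. The following is a minimal sketch only, not the authors' implementation: the names `run_multi_turn`, `Conversation`, and the toy attacker, victim, and detector functions are all hypothetical stand-ins, and the stopping rule (halt once a detector flags a reply) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# One turn is (attacker_message, victim_reply). All names here are hypothetical.
Turn = Tuple[str, str]


@dataclass
class Conversation:
    turns: List[Turn] = field(default_factory=list)
    flagged: bool = False  # True once the detector flags a victim reply


def run_multi_turn(
    attacker: Callable[[List[Turn]], str],
    victim: Callable[[List[Turn], str], str],
    detector: Callable[[str], bool],
    max_turns: int = 5,
) -> Conversation:
    """Alternate attacker and victim turns until the detector fires or turns run out."""
    convo = Conversation()
    for _ in range(max_turns):
        message = attacker(convo.turns)        # next message, conditioned on history
        reply = victim(convo.turns, message)   # victim model's response
        convo.turns.append((message, reply))
        if detector(reply):                    # assumed stopping rule
            convo.flagged = True
            break
    return convo


# Toy stand-ins so the loop can be exercised without any real model.
def toy_attacker(history: List[Turn]) -> str:
    return f"request {len(history) + 1}"


def toy_victim(history: List[Turn], message: str) -> str:
    # Pretends to yield only after two earlier turns of context.
    return "UNSAFE" if len(history) >= 2 else "safe"


def toy_detector(reply: str) -> bool:
    return reply == "UNSAFE"


result = run_multi_turn(toy_attacker, toy_victim, toy_detector)
```

In a real setup the three callables would wrap a red-teaming model, the Code LLM under test, and a vulnerability checker; the loop structure stays the same, which is why it is kept separate from the model-specific pieces here.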

Code & Models

Models

jackysnake/RedCoder (Hugging Face model)