PurpCode: Reasoning for Safer Code Generation

Jiawei Liu; Nirav Diwan; Zhe Wang; Haoyu Zhai; Xiaona Zhou; Kiet A. Nguyen; Tianjiao Yu; Muntasir Wahed; Yinlin Deng; Hadjer Benkraouda; Yuxiang Wei; Lingming Zhang; Ismini Lourentzou; Gang Wang

arXiv:2507.19060 · cs.CR · November 18, 2025

TL;DR

PurpCode is a two-stage post-training approach for safer code generation models: the model first learns to reference cybersafety rules, and reinforcement learning then improves safety without sacrificing utility.

Contribution

The paper introduces a post-training recipe that combines rule learning with reinforcement learning to improve the safety of code generation models against cyber threats.

Findings

01 PurpCode-32B outperforms existing models in cybersafety.
02 The alignment method reduces overrefusal rates.
03 The approach maintains code utility and security knowledge.

Abstract

We introduce PurpCode, the first post-training recipe for training safe code reasoning models towards generating secure code and defending against malicious cyberactivities. PurpCode trains a reasoning model in two stages: (i) Rule Learning, which explicitly teaches the model to reference cybersafety rules to generate vulnerability-free code and to avoid facilitating malicious cyberactivities; and (ii) Reinforcement Learning, which optimizes model safety and preserves model utility through diverse, multi-objective reward mechanisms. To empower the training pipelines with comprehensive cybersafety data, we conduct internal red-teaming to synthesize comprehensive and high-coverage prompts based on real-world tasks for inducing unsafe cyberactivities in the model. Based on PurpCode, we develop a reasoning-based coding model, namely PurpCode-32B, which demonstrates state-of-the-art…
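The abstract does not spell out how the "diverse, multi-objective reward mechanisms" are defined. As a purely illustrative sketch, and not the authors' implementation, the idea of rewarding different behaviour for different prompt types could look like the following. The prompt categories, the `Sample` fields, and the 0/1 reward values are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One model response plus hypothetical oracle judgments about it."""
    prompt_kind: str         # assumed labels: "malicious", "secure_code", "utility"
    refused: bool            # did the model decline to answer?
    has_vulnerability: bool  # did a (hypothetical) analyzer flag the code?
    tests_passed: bool       # did the code pass its (hypothetical) unit tests?


def reward(sample: Sample) -> float:
    """Return a scalar reward in {0.0, 1.0} depending on the prompt type."""
    if sample.prompt_kind == "malicious":
        # Safety objective: declining a malicious request is the desired behaviour.
        return 1.0 if sample.refused else 0.0
    if sample.refused:
        # Refusing a benign request counts as overrefusal and earns nothing.
        return 0.0
    if sample.prompt_kind == "secure_code":
        # Security objective: the generated code should not be flagged as vulnerable.
        return 0.0 if sample.has_vulnerability else 1.0
    # Utility objective: ordinary coding tasks are judged by their tests.
    return 1.0 if sample.tests_passed else 0.0
```

In this toy design each prompt type has its own success criterion, so a model cannot score well by refusing everything or by ignoring security, which is the trade-off between safety and utility that the abstract describes.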

Code & Models

Datasets

LeTue09/train_safetycode_instruct_v1
dataset · 22 downloads

Videos

PurpCode: Reasoning for Safer Code Generation · SlidesLive

Taxonomy

Topics: Formal Methods in Verification · Software Testing and Debugging Techniques · Software Reliability and Analysis Research