SecPI: Secure Code Generation with Reasoning Models via Security Reasoning Internalization
Hao Wang, Niels Mündler, Mark Vero, Jingxuan He, Dawn Song, Martin Vechev

TL;DR
SecPI is a fine-tuning pipeline that enables reasoning language models to generate secure code by internalizing structured security reasoning, improving security and correctness without explicit security prompts.
Contribution
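SecPI introduces a fine-tuning approach that teaches RLMs to reason about security vulnerabilities on their own. It avoids both the costly, manually curated security datasets that prior training-based methods depend on and the loss of functional correctness caused by inference-time security reminders.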
Findings
SecPI increases secure, functionally correct code generation from 48.2% to 62.2% on CWEval.
SecPI improves secure code generation from 18.2% to 22.0% on BaxBench.
The approach generalizes across different CWEs and programming languages.
Abstract
Reasoning language models (RLMs) are increasingly used in programming. Yet, even state-of-the-art RLMs frequently introduce critical security vulnerabilities in generated code. Prior training-based approaches for secure code generation face a critical limitation that prevents their direct application to RLMs: they rely on costly, manually curated security datasets covering only a limited set of vulnerabilities. At the inference level, generic security reminders consistently degrade functional correctness while triggering only shallow ad-hoc vulnerability analysis. To address these problems, we present SecPI, a fine-tuning pipeline that teaches RLMs to internalize structured security reasoning, producing secure code by default without any security instructions at inference time. SecPI filters existing general-purpose coding datasets for security-relevant tasks using an LLM-based…
