SecureForge: Finding and Preventing Vulnerabilities in LLM-Generated Code via Prompt Optimization
Houjun Liu, Lisa Einstein, John Yang, Joachim Baumann, Duncan Eddy, Christopher D. Manning, Mykel Kochenderfer, Diyi Yang

TL;DR
SecureForge is an automated pipeline that audits frontier models and optimizes system prompts to reduce security vulnerabilities in LLM-generated code by up to 48% while maintaining unit test performance.
Contribution
SecureForge introduces a method that identifies vulnerabilities in LLM-generated code and mitigates them through synthetic prompt generation and prompt optimization.
Findings
SecureForge reduces output vulnerabilities by up to 48%.
It achieves a Pareto improvement in unit test success and security.
System prompts transfer effectively to real-world coding agents.
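To make the "Pareto improvement" finding concrete, here is a minimal sketch of what the term means for two metrics: a candidate is a Pareto improvement over a baseline if it is no worse on either metric and strictly better on at least one. The `Result` type, the function name, and all numbers are hypothetical illustrations, not values or code from the paper.

```python
from typing import NamedTuple


class Result(NamedTuple):
    """Hypothetical summary of one prompt configuration."""
    unit_test_pass_rate: float   # higher is better
    vulnerability_rate: float    # lower is better


def is_pareto_improvement(baseline: Result, candidate: Result) -> bool:
    """True if candidate is no worse on both metrics and strictly better on one."""
    no_worse = (
        candidate.unit_test_pass_rate >= baseline.unit_test_pass_rate
        and candidate.vulnerability_rate <= baseline.vulnerability_rate
    )
    strictly_better = (
        candidate.unit_test_pass_rate > baseline.unit_test_pass_rate
        or candidate.vulnerability_rate < baseline.vulnerability_rate
    )
    return no_worse and strictly_better


# Placeholder numbers for illustration only (not results from the paper).
baseline = Result(unit_test_pass_rate=0.80, vulnerability_rate=0.20)
safer_same_tests = Result(unit_test_pass_rate=0.80, vulnerability_rate=0.10)
safer_worse_tests = Result(unit_test_pass_rate=0.70, vulnerability_rate=0.10)

print(is_pareto_improvement(baseline, safer_same_tests))
print(is_pareto_improvement(baseline, safer_worse_tests))
```

Under this definition, a prompt that lowers the vulnerability rate but also lowers the unit test pass rate is a trade-off rather than a Pareto improvement, which is why the second check fails.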
Abstract
LLM coding agents now generate code at an unprecedented scale, yet LLM-generated code introduces cybersecurity vulnerabilities into codebases without human involvement. Even when frontier models are explicitly asked to write secure production code with relevant weaknesses to avoid in context, we find that they still produce verifiable vulnerabilities on average 23% of the time across a corpus of 250 benign coding prompts. We introduce SecureForge, an automated pipeline that both audits security risks of frontier models and produces auditing-informed secure system prompts that reduce output security vulnerabilities while maintaining unit test performance. SecureForge first identifies benign prompts that produce statically detectable vulnerabilities, and then amplifies them into a large synthetic prompt corpus of diverse scenarios using a Markovian sampling technique to jointly maintain…
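The auditing step described in the abstract, finding benign prompts whose generated code contains statically detectable vulnerabilities, might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' pipeline: the detector is a toy check that flags direct calls to `eval` or `exec` using Python's `ast` module, standing in for whatever static analyzer the paper uses, and the prompts and code strings are hand-written placeholders rather than model outputs.

```python
import ast

# Assumption: a toy rule set standing in for a real static analyzer.
RISKY_CALLS = {"eval", "exec"}


def has_risky_call(source: str) -> bool:
    """Return True if the source contains a direct call to a risky builtin."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Unparseable code cannot be checked by this toy detector.
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                return True
    return False


def flagged_prompts(generations: dict[str, str]) -> list[str]:
    """Prompts whose generated code trips the detector, in insertion order."""
    return [prompt for prompt, code in generations.items() if has_risky_call(code)]


def vulnerability_rate(generations: dict[str, str]) -> float:
    """Fraction of prompts whose generated code is flagged."""
    if not generations:
        return 0.0
    return len(flagged_prompts(generations)) / len(generations)


# Hypothetical prompt -> generated-code pairs, written by hand for illustration.
toy_generations = {
    "parse a config expression": "def load(expr):\n    return eval(expr)\n",
    "add two numbers": "def add(a, b):\n    return a + b\n",
}

print(flagged_prompts(toy_generations))
print(vulnerability_rate(toy_generations))
```

In a setup like this, the flagged prompts would be the seeds that a later stage expands into a larger synthetic corpus; that expansion is not sketched here because the abstract is truncated before describing it in full.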
