How Secure is Secure Code Generation? Adversarial Prompts Put LLM Defenses to the Test

Melissa Tessa; Iyiola E. Olatunji; Aicha War; Jacques Klein; Tegawendé F. Bissyandé

arXiv:2601.07084·cs.CR·January 13, 2026



TL;DR

This paper systematically evaluates the robustness of secure code generation methods against adversarial prompts, revealing significant gaps in security and functionality that current methods fail to address.

Contribution

It provides the first adversarial audit of secure code generation models, highlighting their vulnerabilities and proposing best practices for improving robustness.

Findings

01 Static analyzers overestimate security by a factor of 7 to 21

02 37 to 60% of outputs labeled as secure are non-functional

03 Secure-and-functional rates drop to 3 to 17% under adversarial prompts
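The difference between a security-only score and a secure-and-functional score can be illustrated with a small calculation. The following is a minimal sketch, not the paper's evaluation harness: the `Sample` record, its field names, and the toy data are all hypothetical, and "secure" here simply means that no analyzer flagged the sample.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One generated program and its verdicts (hypothetical record layout)."""
    analyzer_flags: int  # number of analyzers that reported a vulnerability
    passes_tests: bool   # True if the executable tests pass


def is_secure(sample):
    """Assumption: a sample counts as secure when no analyzer flags it."""
    return sample.analyzer_flags == 0


def secure_rate(samples):
    """Share of samples labeled secure, ignoring functionality."""
    if not samples:
        return 0.0
    return sum(is_secure(s) for s in samples) / len(samples)


def secure_and_functional_rate(samples):
    """Share of samples that are labeled secure AND pass their tests."""
    if not samples:
        return 0.0
    return sum(is_secure(s) and s.passes_tests for s in samples) / len(samples)


def nonfunctional_share_of_secure(samples):
    """Among samples labeled secure, the share that fail their tests."""
    secure = [s for s in samples if is_secure(s)]
    if not secure:
        return 0.0
    return sum(not s.passes_tests for s in secure) / len(secure)


# Toy data, made up for illustration only (not results from the paper):
# 5 samples labeled secure (2 pass tests, 3 fail) and 5 flagged samples.
toy = (
    [Sample(0, True)] * 2
    + [Sample(0, False)] * 3
    + [Sample(1, True)] * 3
    + [Sample(2, False)] * 2
)

print(f"secure only:            {secure_rate(toy):.2f}")
print(f"secure and functional:  {secure_and_functional_rate(toy):.2f}")
print(f"non-functional | secure: {nonfunctional_share_of_secure(toy):.2f}")
```

On this toy set, half of the samples are labeled secure, but only a fifth are both secure and functional, which shows how scoring the two properties separately can inflate the apparent gain.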

Abstract

Recent secure code generation methods, using vulnerability-aware fine-tuning, prefix-tuning, and prompt optimization, claim to prevent LLMs from producing insecure code. However, their robustness under adversarial conditions remains untested, and current evaluations decouple security from functionality, potentially inflating reported gains. We present the first systematic adversarial audit of state-of-the-art secure code generation methods (SVEN, SafeCoder, PromSec). We subject them to realistic prompt perturbations such as paraphrasing, cue inversion, and context manipulation that developers might inadvertently introduce or adversaries deliberately exploit. To enable fair comparison, we evaluate all methods under consistent conditions, jointly assessing security and functionality using multiple analyzers and executable tests. Our findings reveal critical robustness gaps: static…


Taxonomy

Topics: Advanced Malware Detection Techniques · Physical Unclonable Functions (PUFs) and Hardware Security · Security and Verification in Computing