Guiding AI to Fix Its Own Flaws: An Empirical Study on LLM-Driven Secure Code Generation

Hao Yan; Swapneel Suhas Vaidya; Xiaokuan Zhang; Ziyu Yao

arXiv:2506.23034·cs.SE·July 1, 2025

Open Access

TL;DR

This study evaluates how well large language models can be guided to generate secure code and to repair vulnerabilities, using self-generated vulnerability hints and different levels of feedback, and reports both their potential and their limitations.

Contribution

It provides a comprehensive empirical analysis of LLMs' ability to generate and repair secure code using vulnerability hints and feedback, filling a gap in understanding their security capabilities.

Findings

01. Advanced LLMs can generate more secure code when guided by vulnerability hints.

02. LLMs show potential in repairing code vulnerabilities with appropriate feedback.

03. Models vary in effectiveness depending on their size and training data.

Abstract

Large Language Models (LLMs) have become powerful tools for automated code generation. However, these models often overlook critical security practices, which can result in insecure code that contains vulnerabilities, that is, weaknesses or flaws in the code that attackers can exploit to compromise a system. Yet there has been limited exploration of strategies to guide LLMs in generating secure code, and a lack of in-depth analysis of the effectiveness of LLMs in repairing code containing vulnerabilities. In this paper, we present a comprehensive evaluation of state-of-the-art LLMs by examining their inherent tendencies to produce insecure code, their capability to generate secure code when guided by self-generated vulnerability hints, and their effectiveness in repairing vulnerabilities when provided with different levels of feedback. Our study covers both proprietary and…
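
The two guided settings named in the abstract (generation with self-generated vulnerability hints, and repair with feedback) can be pictured with a rough sketch. This is not the authors' implementation: the `LLM` callable, the function names, the prompt wording, and the idea of passing feedback as an optional free-text string are all assumptions made for illustration, and the abstract does not say how the feedback levels are defined.

```python
from typing import Callable, Optional

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def generate_with_hints(llm: LLM, task: str) -> str:
    """Two-step generation: ask the model for likely vulnerabilities,
    then ask it to write code while avoiding them."""
    hints = llm(
        "List security vulnerabilities likely to arise when implementing:\n"
        f"{task}"
    )
    return llm(
        f"Task:\n{task}\n\n"
        f"Avoid these vulnerabilities:\n{hints}\n\n"
        "Write the code."
    )


def build_repair_prompt(code: str, feedback: Optional[str] = None) -> str:
    """Build a repair prompt. With feedback=None the model only sees the code;
    otherwise the feedback text (assumed format) is appended."""
    prompt = f"The following code may contain a vulnerability:\n{code}\n\n"
    if feedback is not None:
        prompt += f"Feedback on the vulnerability:\n{feedback}\n\n"
    return prompt + "Return a repaired version of the code."


def repair(llm: LLM, code: str, feedback: Optional[str] = None) -> str:
    """Single repair attempt using the prompt built above."""
    return llm(build_repair_prompt(code, feedback))
```

Calling `repair` with and without a `feedback` string is one simple way to compare how much the extra information helps a given model; a real study would also need a vulnerability detector to judge the outputs, which this sketch leaves out.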

Taxonomy

Topics: Advanced Malware Detection Techniques · Adversarial Robustness in Machine Learning · Software Engineering Research