How Safe Are AI-Generated Patches? A Large-scale Study on Security Risks in LLM and Agentic Automated Program Repair on SWE-bench

Amirali Sajadi; Kostadin Damevski; Preetha Chatterjee

arXiv:2507.02976·cs.CR·December 30, 2025

TL;DR

This large-scale study assesses the security risks of AI-generated patches in automated program repair. It finds that both a standalone LLM and agentic frameworks can introduce new vulnerabilities, and that how often they do so depends on the code and issue context.

Contribution

First comprehensive security analysis of LLM-generated patches on real-world GitHub issues, highlighting vulnerabilities and contextual factors affecting security in automated program repair.

Findings

01. Llama 3.3 introduces many new vulnerabilities.

02. Agentic frameworks also generate vulnerabilities, particularly when granted more autonomy.

03. Vulnerabilities are linked to specific code and issue characteristics.

Abstract

Large language models (LLMs) and their agentic frameworks are increasingly adopted to perform development tasks such as automated program repair (APR). While prior work has identified security risks in LLM-generated code, most of it has focused on synthetic, simplified, or isolated tasks that lack the complexity of real-world program repair. In this study, we present the first large-scale security analysis of LLM-generated patches using 20,000+ GitHub issues. We evaluate patches proposed by developers, a standalone LLM (Llama 3.3 Instruct-70B), and three top-performing agentic frameworks (OpenHands, AutoCodeRover, HoneyComb). Finally, we analyze a wide range of code, issue, and project-level factors to understand the conditions under which generating insecure patches is more likely. Our findings reveal that Llama introduces many new vulnerabilities, exhibiting unique patterns not found in…

Taxonomy

Topics: Blockchain Technology Applications and Security · Law, AI, and Intellectual Property · Ethics and Social Impacts of AI