PAGENT: Learning to Patch Software Engineering Agents

Haoran Xue; Gias Uddin; Song Wang

arXiv:2506.17772·cs.SE·June 24, 2025

TL;DR

This paper empirically analyzes why patches generated by LLM-based code agents fail, introduces PAGENT to address type-related errors using static analysis and LLM inference, and shows that PAGENT fixes 29 of the 127 type-related failed patches.

Contribution

The paper provides a taxonomy of failure reasons in LLM-generated patches (six categories, each with several sub-categories) and proposes PAGENT, a hybrid approach that combines static analysis with LLM inference to repair type-related errors in patches.
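
To make the idea of a type-related patch error concrete, the following is a minimal sketch of a static check that could flag a variable assigned literals of inconsistent types. It is an illustration only, not PAGENT's implementation: the summary above does not say which static analysis PAGENT uses, and the function names `infer_literal_types` and `conflicting_names` are hypothetical. The sketch uses only Python's standard `ast` module (Python 3.9+).

```python
import ast
from collections import defaultdict


def infer_literal_types(source: str) -> dict[str, set[str]]:
    """Map each simple variable name to the literal types assigned to it.

    Hypothetical helper: only handles `name = <literal>` assignments.
    """
    types: dict[str, set[str]] = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        # Only plain assignments whose right-hand side is a literal constant.
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    types[target.id].add(type(node.value.value).__name__)
    return dict(types)


def conflicting_names(source: str) -> list[str]:
    """Return names that were assigned literals of more than one type."""
    inferred = infer_literal_types(source)
    return sorted(name for name, seen in inferred.items() if len(seen) > 1)


# A toy "patch" in which `timeout` is first an int and later a str.
patch = '''
timeout = 30
label = "retry"
timeout = "30"
'''

print(conflicting_names(patch))
```

In a hybrid setup of the kind the paper describes, names flagged by a check like this could be passed to an LLM together with the surrounding code so that it proposes a consistent type; that hand-off is an assumption here and is not shown.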

Findings

01. PAGENT fixed 29 of 127 type-related failed patches (about 23%).

02. Seven top LLM code agents produced 769 failed patches across 114 unresolved SWE-bench Lite issues.

03. The failure taxonomy has six categories, each with several sub-categories; a frequently observed one is incorrect inference of a variable's type.

Abstract

LLM Agents produce patches automatically to resolve an issue. However, they can generate inaccurate patches. Little is known about the root causes behind those failed patches or how those could be fixed. This paper reports an empirical study of the failed patches generated by seven top LLM code agents. We collected 114 issues from the SWE-bench Lite dataset that remained unresolved across the agents. The seven agents produced a total of 769 failed patches for those issues, which we checked with a combination of GPT-4o and manual analysis. We present a taxonomy of the failure reasons across the patches. The taxonomy contains six categories, with several sub-categories under each category. For example, a frequently observed category is the inability of an LLM to correctly infer/produce the appropriate variable type in the produced patch. As a first step towards addressing such…
