DebugRepair: Enhancing LLM-Based Automated Program Repair via Self-Directed Debugging
Linhao Wu, Yifei Pei, Zhen Yang, Kainan Li, Zhonghang Lu, Hao Tan, Xiran Lyu, Jia Li, Yizhou Chen, Pengyu Xue, Kunwu Zheng, Dan Hao

TL;DR
DebugRepair is a self-directed debugging framework for LLM-based automated program repair. It feeds intermediate runtime evidence, collected through simulated debugging, into patch refinement and reports significant improvements over existing feedback-based methods.
Contribution
The paper proposes a framework that collects intermediate runtime evidence through simulated debugging and uses that evidence to improve patch refinement in LLM-based APR.
Findings
Achieves state-of-the-art performance on multiple benchmarks.
Correctly fixes 224 bugs on Defects4J with GPT-3.5, outperforming previous methods.
Improves repair success rate by 51.3% across various LLMs.
Abstract
Automated Program Repair (APR) has benefited from the code understanding and generation capabilities of Large Language Models (LLMs). Existing feedback-based APR methods iteratively refine candidate patches using test execution feedback and have shown promising results. However, most rely on outcome-level failure symptoms, such as stack traces, which show how failures are observed but fail to expose the intermediate runtime states critical for root-cause analysis. As a result, LLMs often infer bug causes without sufficient runtime evidence, leading to incorrect patches. To address this limitation, we propose DebugRepair, a self-directed debugging framework for LLM-based APR. DebugRepair enhances patch refinement with intermediate runtime evidence collected through simulated debugging. It consists of three components: test semantic purification, simulated instrumentation, and…
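The distinction the abstract draws between outcome-level failure symptoms and intermediate runtime states can be illustrated with a small toy. The sketch below is not DebugRepair's implementation: `buggy_mean`, `outcome_feedback`, and `runtime_evidence` are hypothetical names invented for this example, and Python's standard `sys.settrace` hook stands in for whatever instrumentation the paper actually uses. It only shows what kind of extra information a repair model could receive when per-line variable values are recorded alongside the failure message.

```python
import sys
import traceback


def buggy_mean(values):
    # Hypothetical buggy function: divides by a fixed 2 instead of len(values).
    total = 0
    for v in values:
        total += v
    count = 2  # bug: should be len(values)
    return total / count


def outcome_feedback(func, args, expected):
    """Outcome-level feedback: only the observed failure symptom, or None on success."""
    try:
        actual = func(*args)
    except Exception:
        return traceback.format_exc()
    if actual != expected:
        return f"AssertionError: expected {expected}, got {actual}"
    return None


def runtime_evidence(func, args):
    """Intermediate evidence: local variables captured before each executed line of func."""
    snapshots = []
    code = func.__code__

    def tracer(frame, event, arg):
        # Only trace the function under test, not its callees or the harness.
        if frame.f_code is not code:
            return None
        if event == "line":
            offset = frame.f_lineno - code.co_firstlineno
            snapshots.append((offset, dict(frame.f_locals)))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        try:
            func(*args)
        except Exception:
            pass  # the failure itself is reported by outcome_feedback
    finally:
        sys.settrace(previous)  # always restore the earlier trace hook
    return snapshots
```

Calling `outcome_feedback(buggy_mean, ([1, 2, 3],), 2.0)` reports only that the result differs from the expected value, whereas the last snapshot from `runtime_evidence(buggy_mean, ([1, 2, 3],))` also exposes that `count` is 2 for a three-element input, which points more directly at the faulty line.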
