TL;DR
Repilot enhances automated program repair by integrating Large Language Models with a Completion Engine, significantly increasing the validity and correctness of generated patches in real-world software systems.
Contribution
This work introduces Repilot, a novel framework that synergistically combines LLMs with a Completion Engine to improve patch synthesis during automated program repair.
Findings
Repilot fixes 27% and 47% more bugs than state-of-the-art techniques on the Defects4j 1.2 and 2.0 datasets, respectively.
It produces more valid and correct patches than base LLMs.
The approach is generalizable to other code generation tasks.
Abstract
During Automated Program Repair (APR), it can be challenging to synthesize correct patches for real-world systems in general-purpose programming languages. Recent Large Language Models (LLMs) have been shown to be helpful "copilots" in assisting developers with various coding tasks, and have also been directly applied for patch synthesis. However, most LLMs treat programs as sequences of tokens, meaning that they are ignorant of the underlying semantic constraints of the target programming language. This results in plenty of statically invalid generated patches, impeding the practicality of the technique. Therefore, we propose Repilot, a general code generation framework to further copilot the AI "copilots" (i.e., LLMs) by synthesizing more valid patches during the repair process. Our key insight is that many LLMs produce outputs autoregressively (i.e., token by token), resembling…
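The idea of pairing token-by-token generation with a Completion Engine can be illustrated with a toy sketch. This is not Repilot's implementation. It assumes a hypothetical lookup-table "model" (`TOY_MODEL`), a hypothetical feasibility check (`is_feasible`) standing in for a completion engine, and a made-up set of valid member names. It only shows how filtering candidate tokens at each step can steer greedy decoding away from a statically invalid identifier.

```python
from typing import Dict, Optional, Set

EOS = "<eos>"

# Hypothetical stand-in for an LLM: maps the text generated so far to
# candidate next tokens with scores. A real model would score a full vocabulary.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "": {"size": 0.6, "len": 0.3, "is": 0.1},
    "size": {EOS: 1.0},
    "len": {EOS: 0.7, "gth": 0.3},
    "length": {EOS: 1.0},
    "is": {"Empty": 1.0},
    "isEmpty": {EOS: 1.0},
}


def is_feasible(prefix: str, token: str, valid_names: Set[str]) -> bool:
    """Toy completion-engine check (an assumption, not a real engine API).

    A token is kept only if the extended text can still grow into a known
    name; stopping is allowed only when the text is already a known name.
    """
    if token == EOS:
        return prefix in valid_names
    candidate = prefix + token
    return any(name.startswith(candidate) for name in valid_names)


def generate(valid_names: Optional[Set[str]] = None, max_steps: int = 10) -> str:
    """Greedy token-by-token decoding, optionally guided by the feasibility check."""
    prefix = ""
    for _ in range(max_steps):
        scores = TOY_MODEL.get(prefix, {EOS: 1.0})
        if valid_names is not None:
            # Drop tokens the toy engine considers infeasible.
            scores = {t: s for t, s in scores.items()
                      if is_feasible(prefix, t, valid_names)}
            if not scores:
                break  # no feasible continuation is left
        token = max(scores, key=scores.get)
        if token == EOS:
            break
        prefix += token
    return prefix


unguided = generate()                      # "size": top-scored but not a known member
guided = generate({"length", "isEmpty"})   # "length": invalid tokens were filtered out
print(unguided, guided)
```

In this sketch the unguided run commits to the highest-scoring token even though it names a member that does not exist, while the guided run reaches a valid name with the same greedy strategy. A real system would query static analysis for the feasible set instead of a hard-coded set of names.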
Taxonomy
MethodsFocus · Repair · Balanced Selection
