From Benchmark Data To Applicable Program Repair: An Experience Report
Mahinthan Chandramohan, Jovan Jancic, Yuntong Zhang, Padmanabhan Krishnan

TL;DR
This paper evaluates automated program repair techniques, highlighting their limitations on real-world defects and proposing the augmentation of code with formal specifications to improve repair quality, while discussing challenges and future directions.
Contribution
The paper shows that combining multiple repair techniques improves benchmark performance, but that the combined approach still falls short on real-world defects, which points to a need for richer specifications and verification tools.
Findings
The combined techniques outperform other approaches on standard benchmarks but struggle with realistic industry defects.
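Augmenting code with formal specifications helps LLMs generate higher-quality unit tests for complex production code; the benefit shows for logic and string manipulation errors, not for well-understood errors such as null pointer or index out of bounds.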
Passing tests do not guarantee correct patches in real-world scenarios.
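The last finding can be illustrated with a small sketch. It is not taken from the paper: the function names, the three-case test suite, and the executable postcondition standing in for a formal specification are all hypothetical, chosen only to show how a patch can pass every test and still be wrong.

```python
from itertools import product

# Hypothetical defect: the original function always returned its second argument.
def median3_overfitted(a, b, c):
    # A "plausible" patch that special-cases the one failing test input
    # and otherwise keeps the defective behavior.
    if (a, b, c) == (2, 1, 3):
        return 2
    return b

def median3_correct(a, b, c):
    # Median of three values without sorting.
    return max(min(a, b), min(max(a, b), c))

# Hypothetical test suite: (inputs, expected output).
TESTS = [((1, 2, 3), 2), ((3, 2, 1), 2), ((2, 1, 3), 2)]

def passes_tests(f):
    return all(f(*args) == expected for args, expected in TESTS)

def satisfies_spec(f, domain=range(4)):
    # Assumed specification: the result equals the middle element of the
    # sorted inputs. It is checked exhaustively over a small domain, which
    # is a stand-in for a real verifier, not a proof.
    return all(
        f(a, b, c) == sorted((a, b, c))[1]
        for a, b, c in product(domain, repeat=3)
    )

overfit_tests_ok = passes_tests(median3_overfitted)
overfit_spec_ok = satisfies_spec(median3_overfitted)
correct_tests_ok = passes_tests(median3_correct)
correct_spec_ok = satisfies_spec(median3_correct)
```

Under these assumptions both patches pass the test suite, but only `median3_correct` meets the specification; the overfitted patch fails on inputs such as `(0, 1, 0)`.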
Abstract
This paper describes our approach to automated program repair. We combine various techniques from the literature to achieve this. Our experiments show that our approach performs better than other techniques on standard benchmarks. However, on closer inspection, none of these techniques work on realistic defects that we see in industry. We find that augmenting code with formal specifications enables LLMs to generate higher-quality unit tests, especially for complex production code with improved coverage of edge cases and exception handling. However, specifications add little value for well-understood errors (e.g., null pointer, index out of bounds), but are beneficial for logic and string manipulation errors. Despite encouraging benchmark results, real-world adoption is limited since passing tests do not guarantee correct patches. Current challenges include insufficient expressiveness…
