An Analysis of the Search Spaces for Generate and Validate Patch Generation Systems
Fan Long, Martin Rinard

TL;DR
This paper systematically analyzes the search spaces of automatic patch generation systems, revealing the scarcity of correct patches, the abundance of passing incorrect patches, and the impact of search space size on system effectiveness.
Contribution
The paper provides the first systematic analysis of patch search space characteristics for generate-and-validate systems, highlighting key tradeoffs and challenges in automatic patch generation.
Findings
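Correct patches are sparse: typically at most one per search space per defect.
Incorrect patches that pass every test case in the validation suite are typically orders of magnitude more abundant than correct ones.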
Larger search spaces can reduce the number of correct patches found.
Abstract
We present the first systematic analysis of the characteristics of patch search spaces for automatic patch generation systems. We analyze the search spaces of two current state-of-the-art systems, SPR and Prophet, with 16 different search space configurations. Our results are derived from an analysis of 1104 different search spaces and 768 patch generation executions. Together these experiments consumed over 9000 hours of CPU time on Amazon EC2. The analysis shows that 1) correct patches are sparse in the search spaces (typically at most one correct patch per search space per defect), 2) incorrect patches that nevertheless pass all of the test cases in the validation test suite are typically orders of magnitude more abundant, and 3) leveraging information other than the test suite is therefore critical for enabling the system to successfully isolate correct patches. We also…
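The distinction the abstract draws between patches that merely pass the validation test suite and patches that are actually correct can be illustrated with a toy sketch. This is not the SPR or Prophet search procedure. The candidate set, the test suite, and the reference oracle below are all hypothetical and chosen only to show how a weak test suite lets several incorrect candidates through while a single candidate is correct.

```python
from typing import Callable, Dict, List, Tuple

Patch = Callable[[int, int], int]

# Hypothetical search space: candidate bodies for a function meant to return max(a, b).
CANDIDATES: Dict[str, Patch] = {
    "a": lambda a, b: a,
    "b": lambda a, b: b,
    "a + b": lambda a, b: a + b,
    "a - b": lambda a, b: a - b,
    "a * b": lambda a, b: a * b,
    "a if a > b else b": lambda a, b: a if a > b else b,
}

# Hypothetical (deliberately weak) validation test suite: (inputs, expected output).
TESTS: List[Tuple[Tuple[int, int], int]] = [((3, 0), 3), ((5, 0), 5)]


def passes_tests(patch: Patch) -> bool:
    """Validate step: a candidate is 'plausible' if it passes every test."""
    return all(patch(*args) == expected for args, expected in TESTS)


def is_correct(patch: Patch) -> bool:
    """Assumed oracle: compare against max() on a wider input range.

    A real patch generation system has no such oracle; it is used here
    only to label candidates for the illustration.
    """
    return all(
        patch(a, b) == max(a, b)
        for a in range(-3, 4)
        for b in range(-3, 4)
    )


# Generate-and-validate: enumerate candidates, keep those that pass the tests.
plausible = [name for name, patch in CANDIDATES.items() if passes_tests(patch)]
correct = [name for name in plausible if is_correct(CANDIDATES[name])]

print("plausible:", plausible)
print("correct:", correct)
```

In this toy space the test suite alone cannot separate the one correct candidate from the other passing candidates, which is the situation in which information beyond the test suite becomes necessary.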
Taxonomy
Topics: Software Testing and Debugging Techniques · Advanced Malware Detection Techniques · Software Engineering Research
