Patch Validation in Automated Vulnerability Repair
Zheng Yu, Wenxuan Shi, Xinqian Sun, Zheyun Feng, Meng Xu, Xinyu Xing

TL;DR
This paper highlights the importance of validating automated vulnerability patches against comprehensive tests, revealing that many patches deemed correct fail under more rigorous PoC+ tests, thus exposing overestimated success rates.
Contribution
The authors introduce PVBench, a benchmark with 209 cases and PoC+ tests, to evaluate AVR systems' patch validation beyond basic tests, emphasizing the need for improved validation methods.
Findings
Over 40% of patches that pass basic tests go on to fail PoC+ tests.
Current AVR systems overestimate patch success rates.
Improvement areas include root cause analysis and adherence to specifications.
Abstract
Automated Vulnerability Repair (AVR) systems, especially those leveraging large language models (LLMs), have demonstrated promising results in patching vulnerabilities -- that is, if we trust their patch validation methodology. Ground-truth patches from human developers often come with new tests that not only ensure mitigation of the vulnerability but also encode extra semantics such as root cause location, optimal fix strategy, or subtle coding styles or conventions. And yet, none of the recent AVR systems verify that the auto-generated patches additionally pass these new tests (termed PoC+ tests). This is a subtle yet critical omission. To fill this gap, we constructed a benchmark, PVBench, with 209 cases spanning 20 projects. Each case includes basic tests (functional tests before the patch and the PoC exploit) as well as the associated PoC+ tests.…
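The distinction between basic tests and PoC+ tests can be illustrated with a minimal sketch. It is not the authors' harness: the `PatchOutcome` record, the function names, and the sample data are all hypothetical, chosen only to show how a patch accepted by basic validation can still be rejected once PoC+ tests are applied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatchOutcome:
    """Hypothetical record of one candidate patch's test results."""
    functional_ok: bool  # pre-existing functional tests still pass
    poc_blocked: bool    # the PoC exploit no longer triggers the bug
    poc_plus_ok: bool    # the developer-written PoC+ tests pass


def passes_basic(outcome: PatchOutcome) -> bool:
    """Basic validation: functional tests plus the PoC exploit only."""
    return outcome.functional_ok and outcome.poc_blocked


def classify(outcome: PatchOutcome) -> str:
    """Label a patch under the stricter, two-stage validation."""
    if not passes_basic(outcome):
        return "rejected"
    return "validated" if outcome.poc_plus_ok else "false_positive"


def false_positive_rate(outcomes: list[PatchOutcome]) -> float:
    """Share of basic-test-passing patches that fail PoC+ tests."""
    basic = [o for o in outcomes if passes_basic(o)]
    if not basic:
        return 0.0
    return sum(1 for o in basic if not o.poc_plus_ok) / len(basic)


# Made-up outcomes for illustration only; not data from the paper.
sample = [
    PatchOutcome(True, True, True),    # correct under both validations
    PatchOutcome(True, True, False),   # looks correct, fails PoC+
    PatchOutcome(True, False, False),  # PoC still triggers the bug
    PatchOutcome(False, True, True),   # breaks existing functionality
]
labels = [classify(o) for o in sample]
rate = false_positive_rate(sample)
```

Under this sketch, a success rate computed from `passes_basic` alone counts the second patch as a success, while the two-stage `classify` flags it as a false positive.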
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Web Application Security Vulnerabilities
