Generating Proof-of-Vulnerability Tests to Help Enhance the Security of Complex Software

Shravya Kanchi; Xiaoyan Zang; Ying Zhang; Danfeng Yao; Na Meng

arXiv:2605.03956·cs.CR·May 6, 2026

TL;DR

PoVSmith automates the generation of proof-of-vulnerability tests for complex software, improving security assessment by leveraging large language models and call path analysis.

Contribution

The paper introduces an agent-based, iterative approach that combines call path analysis, code context, and feedback to improve the generation of tests for software vulnerabilities.
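The iterative, feedback-driven part of this approach can be illustrated with a minimal sketch. It is not PoVSmith's implementation: `refine_until_success`, `Attempt`, and the two stub functions are hypothetical names introduced here, and the stubs stand in for a coding agent and a test runner, which the sketch does not call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Attempt:
    """One round of generation plus execution (hypothetical record type)."""
    test_code: str
    passed: bool
    feedback: str


def refine_until_success(
    generate: Callable[[Optional[str]], str],
    execute: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,
) -> List[Attempt]:
    """Regenerate a test until it passes or the round budget is spent.

    `generate` receives the feedback from the previous round (None on the
    first round) and returns new test code. `execute` runs the test and
    returns (passed, feedback).
    """
    history: List[Attempt] = []
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        test_code = generate(feedback)
        passed, feedback = execute(test_code)
        history.append(Attempt(test_code, passed, feedback))
        if passed:
            break
    return history


# Stubs standing in for the agent and the test runner (assumptions).
def fake_generate(feedback: Optional[str]) -> str:
    return "test_v1" if feedback is None else "test_v2"


def fake_execute(test_code: str) -> Tuple[bool, str]:
    if test_code == "test_v2":
        return True, "exploit demonstrated"
    return False, "compilation error: missing import"


history = refine_until_success(fake_generate, fake_execute)
```

With these stubs the first attempt fails, its error message is passed back to the generator, and the second attempt succeeds, so `history` holds two entries.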

Findings

01. PoVSmith identified 96% of the call points to vulnerable APIs.

02. It generated 152 tests, 55% of which demonstrated feasible exploits.

03. It outperformed existing LLM-based methods in test quality and automation.

Abstract

Developers create modern software applications (Apps) on top of third-party libraries (Libs). When library vulnerabilities are reachable through application code, the applications can be vulnerable to software supply chain attacks. Prior work shows that developers often require concrete and executable evidence, i.e., proof-of-vulnerability (PoV) tests, to decide whether a reported dependency vulnerability poses a practical security risk to their application. However, manually crafting such tests is challenging, and existing tool support is insufficient to automate the procedure. To streamline test generation, we created PoVSmith -- a new approach that combines call path analysis, exemplar test, code context, and feedback into multiple prompts to guide a coding agent (i.e., Codex) and a large language model (i.e., GPT) for test generation, execution, and assessment. We evaluated PoVSmith…
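To make the call path analysis step concrete, the sketch below searches a call graph for paths that lead from application entry points to a vulnerable library API. It is an illustration under stated assumptions, not the paper's algorithm: the function `find_call_paths`, the dictionary representation of the call graph, and the method names in the example are all hypothetical.

```python
from collections import deque
from typing import Dict, List


def find_call_paths(
    call_graph: Dict[str, List[str]],
    entry_points: List[str],
    vulnerable_api: str,
    max_depth: int = 10,
) -> List[List[str]]:
    """Return acyclic call paths from any entry point to `vulnerable_api`.

    `call_graph` maps a caller name to the names of the methods it calls.
    Paths longer than `max_depth` nodes are not extended further.
    """
    paths: List[List[str]] = []
    queue = deque([entry] for entry in entry_points)
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == vulnerable_api:
            paths.append(path)
            continue
        if len(path) > max_depth:
            continue
        for callee in call_graph.get(current, []):
            if callee not in path:  # skip cycles
                queue.append(path + [callee])
    return paths


# Hypothetical application and library method names.
call_graph = {
    "App.handleUpload": ["App.parse", "App.log"],
    "App.parse": ["Lib.unsafeDeserialize"],
    "App.log": [],
}
paths = find_call_paths(call_graph, ["App.handleUpload"], "Lib.unsafeDeserialize")
```

Here `paths` contains the single chain from `App.handleUpload` through `App.parse` to `Lib.unsafeDeserialize`. A path like this shows that the library vulnerability is reachable from application code, which is the condition the abstract names for supply chain exposure.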
