Leveraging GPT-4 for Vulnerability-Witnessing Unit Test Generation
Gábor Antal, Dénes Bán, Martin Isztin, Rudolf Ferenc, and Péter Hegedűs

TL;DR
This paper investigates GPT-4's ability to automatically generate unit tests that witness software vulnerabilities, demonstrating promising syntactic correctness and potential for aiding security testing despite some limitations.
Contribution
It presents an empirical study on GPT-4's effectiveness in generating vulnerability-witnessing tests from real vulnerability datasets, highlighting its potential in automated security testing.
Findings
GPT-4 generates syntactically correct tests 66.5% of the time
Semantic correctness validation is possible in 7.5% of cases
Generated tests can be manually refined into effective vulnerability witnesses
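To make the term concrete, here is a minimal sketch of what a "vulnerability-witnessing" test looks like: a test that fails on the pre-fix code and passes on the post-fix code. This is not taken from the paper or from VUL4J (whose projects are Java); it is a Python analogue built around a hypothetical path-traversal bug. The names `BASE_DIR`, `resolve_vulnerable`, `resolve_fixed`, and `witness_test` are all assumptions introduced for illustration.

```python
import posixpath

# Hypothetical base directory that user-supplied paths must stay inside.
BASE_DIR = "/srv/app/uploads"


def resolve_vulnerable(user_path: str) -> str:
    """Pre-fix version (illustrative): joins and normalizes without a containment check."""
    return posixpath.normpath(posixpath.join(BASE_DIR, user_path))


def resolve_fixed(user_path: str) -> str:
    """Post-fix version (illustrative): rejects any path that escapes BASE_DIR."""
    resolved = posixpath.normpath(posixpath.join(BASE_DIR, user_path))
    if resolved != BASE_DIR and not resolved.startswith(BASE_DIR + "/"):
        raise ValueError("path escapes base directory")
    return resolved


def witness_test(resolve) -> bool:
    """Return True if `resolve` withstands a traversal attempt, False otherwise.

    A witnessing test should return False for the vulnerable version
    and True for the fixed version.
    """
    try:
        resolved = resolve("../../etc/passwd")
    except ValueError:
        return True  # the traversal attempt was rejected
    return resolved.startswith(BASE_DIR + "/")


if __name__ == "__main__":
    print("vulnerable passes:", witness_test(resolve_vulnerable))
    print("fixed passes:", witness_test(resolve_fixed))
```

Running the same test against both versions is what turns an ordinary unit test into evidence that the fix mitigates the vulnerability.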
Abstract
In the software development life cycle, testing plays a crucial role in quality assurance. Proper testing not only increases code coverage and prevents regressions but can also ensure that any potential vulnerabilities in the software are identified and effectively fixed. However, creating such tests is a complex, resource-consuming manual process. To help developers and security experts, this paper explores the automatic unit test generation capability of one of the most widely used large language models, GPT-4, from the perspective of vulnerabilities. We examine a subset of the VUL4J dataset containing real vulnerabilities and their corresponding fixes to determine whether GPT-4 can generate syntactically and/or semantically correct unit tests based on the code before and after the fixes as evidence of vulnerability mitigation. We focus on the impact of code contexts, the…
Taxonomy
Topics: Software Testing and Debugging Techniques · Security and Verification in Computing · Adversarial Robustness in Machine Learning
Methods: Dropout · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Layer Normalization · Dense Connections · Softmax · Transformer · GPT-4 · Focus
