A Match Made in Heaven? AI-driven Matching of Vulnerabilities and Security Unit Tests
Emanuele Iannone, Quang-Cuong Bui, Riccardo Scandariato

TL;DR
VuTeCo is an AI framework that automatically identifies and links security-related unit tests to vulnerabilities in Java code, facilitating large-scale collection and understanding of vulnerability-witnessing tests.
Contribution
Introduces VuTeCo, an AI-driven system for detecting security-related unit tests and matching them to vulnerabilities, enabling large-scale data collection for security testing.
Findings
Achieved an F0.5 score of 0.73 (precision 0.83) in security test detection, the Finding task, on unit tests from Vul4J.
Achieved an F0.5 score of 0.65 in matching tests to vulnerabilities, the Matching task.
Collected 224 confirmed security-related tests from real projects.
Abstract
Software vulnerabilities are often detected via taint analysis, penetration testing, or fuzzing. They are also found via unit tests that exercise security-sensitive behavior with specific inputs, called vulnerability-witnessing tests. Generative AI models could help developers in writing them, but they require many examples to learn from, which are currently scarce. This paper introduces VuTeCo, an AI-driven framework for collecting examples of vulnerability-witnessing tests from Java repositories. VuTeCo carries out two tasks: (1) The "Finding" task to determine whether a unit test case is security-related, and (2) the "Matching" task to relate a test case to the vulnerability it witnesses. VuTeCo addresses the Finding task with UniXcoder, achieving an F0.5 score of 0.73 and a precision of 0.83 on a test set of unit tests from Vul4J. The Matching task is addressed using DeepSeek Coder,…
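The F0.5 score reported above weights precision more heavily than recall. The sketch below uses the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), to show how the two reported Finding-task numbers relate. The recall it derives is not stated in the text above; it is only what the rounded figures (F0.5 = 0.73, precision = 0.83) imply, so treat it as approximate. The function names are illustrative and are not part of VuTeCo.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score; beta < 1 favors precision over recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def recall_from_f_beta(f: float, precision: float, beta: float = 0.5) -> float:
    """Solve the F-beta formula for recall, given F-beta and precision."""
    b2 = beta ** 2
    denom = (1 + b2) * precision - f
    if denom <= 0:
        raise ValueError("no valid recall exists for this F-beta/precision pair")
    return f * b2 * precision / denom


# Figures taken from the abstract (rounded to two decimals there).
implied_recall = recall_from_f_beta(0.73, 0.83, beta=0.5)
print(f"implied recall: {implied_recall:.2f}")

# Sanity check: plugging the implied recall back in recovers the reported score.
print(f"reconstructed F0.5: {f_beta(0.83, implied_recall, beta=0.5):.2f}")
```

Under these assumptions the implied recall is roughly one half. This fits a detector tuned to avoid false positives when collecting examples, even if it misses some security-related tests.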
Code & Models
- 🤗 emaiannone/vuteco-dsc-e2e (model) · 1 dl
- 🤗 emaiannone/vuteco-uxc-fnd (model)
- 🤗 emaiannone/vuteco-cb-e2e (model) · 1 dl
- 🤗 emaiannone/vuteco-cb-fnd (model) · 2 dl
- 🤗 emaiannone/vuteco-cl-e2e (model) · 2 dl
- 🤗 emaiannone/vuteco-ct5p-fnd (model) · 4 dl
- 🤗 emaiannone/vuteco-ct5p-e2e (model) · 1 dl
- 🤗 emaiannone/vuteco-uxc-e2e (model) · 4 dl
- 🤗 emaiannone/vuteco-dsc-fnd (model) · 1 dl
- 🤗 emaiannone/vuteco-qc-e2e (model)
Taxonomy
Topics: Software System Performance and Reliability · Infrastructure Resilience and Vulnerability Analysis
