Demonstrating ARG-V's Generation of Realistic Java Benchmarks for SV-COMP
Charles Moloney, Robert Dyer, Elena Sherman

TL;DR
This paper introduces ARG-V, a tool that automatically generates realistic Java benchmarks for SV-COMP, revealing how new benchmarks can impact verifier performance and aiding in the development of more applicable verification tools.
Contribution
The paper presents ARG-V, a novel tool for creating realistic Java benchmarks for SV-COMP, demonstrating its effectiveness and impact on verifier evaluation.
Findings
All four leading Java verifiers perform worse on new benchmarks
ARG-V generates benchmarks that challenge existing verification tools
New benchmarks reveal limitations in current verifier capabilities
Abstract
The SV-COMP competition provides a state-of-the-art platform for evaluating software verification tools on a standardized set of verification tasks. Consequently, verifier development outcomes are influenced by the composition of program benchmarks included in SV-COMP. When expanding this benchmark corpus, it is crucial to consider whether newly added programs cause verifiers to exhibit behavior distinct from that observed on existing benchmarks. Doing so helps mitigate external threats to the validity of the competition's results. In this paper, we present the application of the ARG-V tool for automatically generating Java verification benchmarks in the SV-COMP format. We demonstrate that, on a newly generated set of 68 realistic benchmarks, all four leading Java verifiers decrease in accuracy and recall compared to their performance on the existing benchmark suite. These findings…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsLogic, programming, and type systems · Formal Methods in Verification · Security and Verification in Computing
