SoundnessBench: A Soundness Benchmark for Neural Network Verifiers

Xingjian Zhou; Keyi Shen; Andy Xu; Hongji Xu; Cho-Jui Hsieh; Huan Zhang; Zhouxing Shi

arXiv:2412.03154·cs.LG·January 1, 2026

SoundnessBench: A Soundness Benchmark for Neural Network Verifiers

Xingjian Zhou, Keyi Shen, Andy Xu, Hongji Xu, Cho-Jui Hsieh, Huan Zhang, Zhouxing Shi

PDF

Open Access 1 Repo

TL;DR

SoundnessBench is a new benchmark with hidden counterexamples designed to test the soundness of neural network verifiers, revealing bugs in state-of-the-art tools and improving verification reliability.

Contribution

We introduce SoundnessBench, a novel benchmark with deliberately inserted hidden counterexamples to evaluate and improve the soundness of neural network verifiers.

Findings

01

SoundnessBench effectively identifies bugs in current verifiers.

02

Our training method successfully produces neural networks with hidden counterexamples.

03

The benchmark covers various architectures and activation functions.

Abstract

Neural network (NN) verification aims to formally verify properties of NNs, which is crucial for ensuring the behavior of NN-based models in safety-critical applications. In recent years, the community has developed many NN verifiers and benchmarks to evaluate them. However, existing benchmarks typically lack ground-truth for hard instances where no current verifier can verify the property and no counterexample can be found. This makes it difficult to validate the soundness of a verifier, when it claims verification on such challenging instances that no other verifier can handle. In this work, we develop a new benchmark for NN verification, named SoundnessBench, specifically for testing the soundness of NN verifiers. SoundnessBench consists of instances with deliberately inserted counterexamples that are hidden from adversarial attacks commonly used to find counterexamples. Thereby, it…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

mvp-harry/soundnessbench
pytorchOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsNeural Networks and Applications · Adversarial Robustness in Machine Learning