Stress-Testing Neural Network Verifiers with Provably Robust Instances

David Troxell; Yulia Alexandr; Sofia Hunt; Stephanie Lei; Guido Mont\'ufar

arXiv:2605.17153·cs.LG·May 19, 2026

Stress-Testing Neural Network Verifiers with Provably Robust Instances

David Troxell, Yulia Alexandr, Sofia Hunt, Stephanie Lei, Guido Mont\'ufar

PDF

1 Repo

TL;DR

This paper introduces a framework for generating neural network verification instances with known ground-truth labels, enabling systematic stress-testing and analysis of verifier failure modes.

Contribution

It provides a reusable method for creating verification benchmarks with ground-truth labels and a new difficulty profile to analyze verifier robustness.

Findings

01

Discovered numeric tolerance issues in popular verifiers.

02

Identified an implementation bug affecting verifier accuracy.

03

Different instances stress different aspects of verification pipelines.

Abstract

Neural network verifiers aim to provide formal guarantees on model behavior, but existing verification benchmarks are fundamentally limited by their lack of ground-truth labels. As a result, verifier evaluation relies on indirect heuristics, which prevents exact scoring and systematic study of verifier failure modes. We address this gap by introducing a reusable framework for generating verification instances whose ground-truth robustness labels are known a priori through analytic construction. Our framework led to the discovery of multiple numeric tolerance concerns and an implementation bug in popular verifiers, highlighting the need for ground-truth labels. Additionally, to systematically study verifier failure modes, we introduce the verification Difficulty Profile, a collection of estimable quantities capturing distinct sources of instance hardness. Using our framework and these…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

dtroxell19/VeriStressGT.git
github

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.