Exact Certification of Neural Networks and Partition Aggregation Ensembles against Label Poisoning
Ajinkya Mohgaonkar, Lukas Gosch, Mahalakshmi Sabanayagam, Debarghya Ghoshdastidar, Stephan Günnemann

TL;DR
This paper introduces EnsembleCert, a white-box certification framework for neural network ensembles against label-flipping attacks, offering tighter guarantees and efficient exact certificates leveraging neural tangent kernels.
Contribution
It presents EnsembleCert, the first white-box certification method for partition-aggregation ensembles, and ScaLabelCert, an exact polynomial-time certificate for neural networks against label-flipping.
Findings
EnsembleCert outperforms black-box methods with up to 26.5% more label flips certified.
ScaLabelCert provides the first exact, polynomial-time certificate for neural networks against label flipping.
The approach requires 100 times fewer partitions for certification, challenging the assumption that strong certificates need many partitions.
Abstract
Label-flipping attacks, which corrupt training labels to induce misclassifications at inference, remain a major threat to supervised learning models. This drives the need for robustness certificates that provide formal guarantees about a model's robustness under adversarially corrupted labels. Existing certification frameworks rely on ensemble techniques such as smoothing or partition-aggregation, but treat the corresponding base classifiers as black boxes, yielding overly conservative guarantees. We introduce EnsembleCert, the first certification framework for partition-aggregation ensembles that utilizes white-box knowledge of the base classifiers. Concretely, EnsembleCert yields tighter guarantees than black-box approaches by aggregating per-partition white-box certificates to compute ensemble-level guarantees in polynomial time. To extract white-box knowledge from the base…
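The difference between black-box and white-box aggregation described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a binary task, disjoint partitions (so each label flip touches exactly one base classifier), ties resolved against the current prediction, and a hypothetical per-partition certificate `top_radii[i]` giving the number of flips inside partition `i` that provably cannot change that partition's vote. The function names are illustrative only.

```python
def blackbox_certified_flips(votes_top: int, votes_other: int) -> int:
    """Black-box bound: any single flip may change one base classifier's vote."""
    gap = votes_top - votes_other
    if gap <= 0:
        return 0
    # The adversary must switch at least ceil(gap / 2) top-class votes
    # (ties are assumed to go against the current prediction).
    need = (gap + 1) // 2
    return need - 1


def whitebox_certified_flips(top_radii: list[int], votes_other: int) -> int:
    """White-box bound using assumed per-partition certified radii.

    top_radii[i] is the (hypothetical) number of label flips within
    partition i that provably leave its top-class vote unchanged, so
    switching that vote costs at least top_radii[i] + 1 flips.
    """
    gap = len(top_radii) - votes_other
    if gap <= 0:
        return 0
    need = (gap + 1) // 2
    # The cheapest attack targets the partitions that are easiest to switch.
    costs = sorted(r + 1 for r in top_radii)
    min_attack_cost = sum(costs[:need])
    return min_attack_cost - 1


if __name__ == "__main__":
    radii = [0, 1, 2, 3, 3, 4, 5]  # seven partitions voting for the top class
    print(blackbox_certified_flips(len(radii), 2))
    print(whitebox_certified_flips(radii, 2))
```

With all radii set to zero the white-box bound reduces to the black-box one, which is the sense in which per-partition knowledge can only tighten the ensemble-level guarantee in this simplified setting.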
