Semidefinite relaxations for certifying robustness to adversarial examples
Aditi Raghunathan, Jacob Steinhardt, Percy Liang

TL;DR
This paper introduces a semidefinite relaxation for certifying the robustness of ReLU neural networks to adversarial examples. It gives tighter guarantees than previous relaxations and applies to arbitrary ReLU networks.
Contribution
A novel semidefinite relaxation technique that improves the certification of neural network robustness, outperforming previous relaxations on diverse network architectures.
Findings
Tighter robustness bounds than previous relaxations.
Remains effective on networks that were not trained against the relaxation.
Provides meaningful robustness guarantees across different architectures.
Abstract
Despite their impressive performance on diverse tasks, neural networks fail catastrophically in the presence of adversarial inputs---imperceptibly but adversarially perturbed versions of natural inputs. We have witnessed an arms race between defenders who attempt to train robust networks and attackers who try to construct adversarial examples. One promise of ending the arms race is developing certified defenses, ones which are provably robust against all attackers in some family. These certified defenses are based on convex relaxations which construct an upper bound on the worst case loss over all attackers in the family. Previous relaxations are loose on networks that are not trained against the respective relaxation. In this paper, we propose a new semidefinite relaxation for certifying robustness that applies to arbitrary ReLU networks. We show that our proposed relaxation is tighter…
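To make the idea of a semidefinite relaxation concrete, here is a minimal sketch in plain Python. It assumes the standard quadratic characterization of ReLU, which the abstract excerpt above does not spell out: z = max(x, 0) holds exactly when z >= 0, z >= x, and z(z - x) = 0. The function names (`satisfies_relu_constraints`, `satisfies_box_constraint`, `lift`) are hypothetical and chosen for illustration only. No SDP is solved here. The sketch only checks the constraints that such a relaxation would lift into matrix form.

```python
def relu(x: float) -> float:
    return max(x, 0.0)


def satisfies_relu_constraints(x: float, z: float, tol: float = 1e-9) -> bool:
    """True iff z >= 0, z >= x and z * (z - x) == 0 (up to tol).

    Together these three conditions are equivalent to z == max(x, 0).
    """
    return z >= -tol and z >= x - tol and abs(z * (z - x)) <= tol


def satisfies_box_constraint(x: float, lo: float, hi: float, tol: float = 1e-9) -> bool:
    """True iff lo <= x <= hi, written as the quadratic (x - lo) * (x - hi) <= 0."""
    return x * x <= (lo + hi) * x - lo * hi + tol


def lift(x: float, z: float) -> list[list[float]]:
    """Rank-one matrix P = v v^T for v = (1, x, z).

    Every quadratic constraint above is linear in the entries of P.
    A semidefinite relaxation keeps those linear constraints and
    replaces "P is rank one" by "P is positive semidefinite".
    """
    v = (1.0, x, z)
    return [[a * b for b in v] for a in v]


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.5, 3.0):
        print(x, relu(x), satisfies_relu_constraints(x, relu(x)))
    P = lift(2.0, relu(2.0))
    # z * z == x * z becomes the linear condition P[2][2] == P[1][2].
    print(P[2][2] == P[1][2])
```

In the lifted matrix, the condition z(z - x) = 0 reads P[2][2] = P[1][2], and z >= x reads P[0][2] >= P[0][1]. Dropping the rank-one requirement makes the feasible set convex, so optimizing over it gives an upper bound on the worst-case loss.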
Taxonomy
Topics: Adversarial Robustness in Machine Learning
