Certified Defenses against Adversarial Examples

Aditi Raghunathan; Jacob Steinhardt; Percy Liang

arXiv:1801.09344·cs.LG·November 3, 2020·340 cites

TL;DR

This paper introduces a certification method, based on a semidefinite relaxation, for neural networks with one hidden layer. The certificate bounds the error that any adversarial attack can cause and, because it is differentiable, is optimized jointly with the network parameters to encourage robustness. Results are demonstrated on MNIST.

Contribution

It presents a novel certification technique based on a semidefinite relaxation, and uses the resulting differentiable certificate as an adaptive regularizer that trains the network to be robust to adversarial perturbations.

Findings

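01. Certifies robustness on MNIST against attacks that perturb each pixel by at most epsilon = 0.1

02. Guarantees that no such attack can cause more than 35% test error

03. Provides a differentiable certificate that can be optimized jointly with the network parameters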

Abstract

While neural networks have achieved high accuracy on standard image classification benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations to test inputs. Defenses based on regularization and adversarial training have been proposed, but often followed by new, stronger attacks that defeat these defenses. Can we somehow end this arms race? In this work, we study this problem for neural networks with one hidden layer. We first propose a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value. Second, as this certificate is differentiable, we jointly optimize it with the network parameters, providing an adaptive regularizer that encourages robustness against all attacks. On MNIST, our approach produces a network and a certificate that no…
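To make the idea of a certificate concrete, here is a minimal sketch in Python with NumPy. It is not the paper's semidefinite relaxation. It substitutes a much simpler and looser bound that follows from ReLU being 1-Lipschitz, and it assumes a two-class, one-hidden-layer ReLU network whose score difference is f(x) = v^T relu(W x). The names `margin`, `naive_linf_certificate`, and `is_certified`, and the toy weights, are hypothetical and chosen only for illustration.

```python
import numpy as np


def margin(W, v, x):
    """Score difference f(x) = v^T relu(W x) between the two classes."""
    return float(v @ np.maximum(W @ x, 0.0))


def naive_linf_certificate(W, v, eps):
    """Upper bound on |f(x') - f(x)| over all x' with ||x' - x||_inf <= eps.

    ReLU is 1-Lipschitz, so hidden unit j changes by at most
    eps * ||W[j]||_1, and the output by at most
    eps * sum_j |v[j]| * ||W[j]||_1.
    This is an assumed stand-in, far looser than an SDP-based bound.
    """
    return float(eps * np.sum(np.abs(v) * np.abs(W).sum(axis=1)))


def is_certified(W, v, x, eps):
    """True if no allowed perturbation can change the sign of f(x)."""
    return abs(margin(W, v, x)) > naive_linf_certificate(W, v, eps)


# Toy example (hypothetical weights, not taken from the paper).
W = np.array([[1.0, -1.0],
              [0.5, 0.5]])
v = np.array([1.0, -1.0])
x = np.array([1.0, 0.0])
eps = 0.1

# Here the margin is 0.5 and the bound is about 0.3, so the point is certified.
print(margin(W, v, x), naive_linf_certificate(W, v, eps), is_certified(W, v, x, eps))
```

Like the certificate described in the abstract, this bound holds for every attack within the perturbation budget rather than for one particular attack. It is also differentiable almost everywhere in `W` and `v`, so in principle it could be added to a training loss as a regularizer, though a bound this loose would likely certify far fewer inputs than a tighter relaxation.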

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning