Robustness Certificates for Sparse Adversarial Attacks by Randomized Ablation

Alexander Levine; Soheil Feizi

arXiv:1911.09272·cs.LG·November 22, 2019

TL;DR

This paper introduces a novel randomized ablation technique to certify and improve classifier robustness against sparse adversarial attacks, extending robustness guarantees to the L_0 threat model.

Contribution

It proposes an efficient, certifiably robust defense using feature ablation, providing tighter robustness certificates than previous additive noise methods.

Findings

01

Certifies over 50% of MNIST images to be robust to 8-pixel distortions.

02

Achieves median robustness of 8 pixels on MNIST, outperforming prior noise-based certificates.

03

Demonstrates high empirical robustness to sparse attacks with only slight accuracy decrease.

Abstract

Recently, techniques have been developed to provably guarantee the robustness of a classifier to adversarial perturbations of bounded L_1 and L_2 magnitudes by using randomized smoothing: the robust classification is a consensus of base classifications on randomly noised samples where the noise is additive. In this paper, we extend this technique to the L_0 threat model. We propose an efficient and certifiably robust defense against sparse adversarial attacks by randomly ablating input features, rather than using additive noise. Experimentally, on MNIST, we can certify the classifications of over 50% of images to be robust to any distortion of at most 8 pixels. This is comparable to the observed empirical robustness of unprotected classifiers on MNIST to modern L_0 attacks, demonstrating the tightness of the proposed robustness certificate. We also evaluate our certificate on ImageNet…
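The ablation-based smoothing described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the names `NULL`, `ablate`, `smoothed_counts`, and `overlap_bound` are hypothetical, the sentinel encoding of ablated features is an assumption, and `base_classifier` stands in for any function that maps an ablated input to a class index. The bound uses a simple counting argument: if k of d features are retained uniformly at random, the retained set misses all rho perturbed features with probability C(d - rho, k) / C(d, k), in which case the base classifier sees an identical input.

```python
import math
import numpy as np

NULL = -1.0  # hypothetical sentinel marking an ablated feature (assumes inputs never equal -1)

def ablate(x, k, rng):
    """Keep k uniformly chosen features of x and replace all others with NULL."""
    flat = np.asarray(x, dtype=float).reshape(-1)
    keep = rng.choice(flat.size, size=k, replace=False)
    out = np.full(flat.shape, NULL, dtype=float)
    out[keep] = flat[keep]
    return out.reshape(np.shape(x))

def smoothed_counts(base_classifier, x, k, n_samples, n_classes, rng):
    """Count base-classifier votes over n_samples independently ablated copies of x."""
    counts = np.zeros(n_classes, dtype=int)
    for _ in range(n_samples):
        counts[base_classifier(ablate(x, k, rng))] += 1
    return counts

def overlap_bound(d, k, rho):
    """Probability that k retained features (out of d) include at least one of rho changed ones."""
    return 1.0 - math.comb(d - rho, k) / math.comb(d, k)
```

Under this argument, changing at most rho features moves each smoothed class probability by no more than `overlap_bound(d, k, rho)`, so a margin between the top two classes larger than twice that value is enough for the consensus to stay fixed. In practice the class probabilities are estimated from a finite number of samples, so a certificate would also need statistical confidence bounds, which this sketch leaves out.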

Code & Models

Repositories

alevine0/randomizedAblation (PyTorch, official)
