Erratum Concerning the Obfuscated Gradients Attack on Stochastic Activation Pruning

Guneet S. Dhillon; Nicholas Carlini

arXiv:2010.00071·cs.LG·October 2, 2020

TL;DR

This erratum reports that the Obfuscated Gradients attack (Athalye et al., 2018) appeared effective against SAP only because a flawed re-implementation artificially weakened the defense. Applied properly, SAP withstands that attack, but a new use of the BPDA technique still reduces its accuracy to 0.1%.

Contribution

It identifies a flaw in the re-implementation of SAP used in the prior evaluation and shows that SAP, when applied properly, is not broken by the originally proposed attack, although a new use of BPDA still defeats it.

Findings

01 A flawed re-implementation led the prior evaluation to underestimate SAP's robustness

02 Properly applied SAP resists the attack proposed in the Obfuscated Gradients paper

03 A new use of the BPDA attack technique still reduces SAP's accuracy to 0.1%

Abstract

Stochastic Activation Pruning (SAP) (Dhillon et al., 2018) is a defense to adversarial examples that was attacked and found to be broken by the "Obfuscated Gradients" paper (Athalye et al., 2018). We discover a flaw in the re-implementation that artificially weakens SAP. When SAP is applied properly, the proposed attack is not effective. However, we show that a new use of the BPDA attack technique can still reduce the accuracy of SAP to 0.1%.
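
The abstract names SAP without describing the mechanism. As a rough illustration only, the sketch below follows the scheme as we understand it from Dhillon et al. (2018): each activation in a layer is sampled with probability proportional to its magnitude, activations that are never drawn are zeroed, and the survivors are rescaled so the layer's expected output is unchanged. The function name `sap_layer`, the parameter `num_samples`, and the flat NumPy vector are assumptions made for this example; this is not the authors' code and not the re-implementation the erratum discusses.

```python
import numpy as np


def sap_layer(h, num_samples, rng):
    """Illustrative stochastic activation pruning on a 1-D activation vector.

    Hypothetical helper: names and structure are assumptions for this sketch.
    """
    h = np.asarray(h, dtype=float)
    magnitude = np.abs(h)
    total = magnitude.sum()
    if total == 0.0:
        # Nothing to sample from; return the activations unchanged.
        return h.copy()

    # Sampling probability of each activation is proportional to |h_i|.
    p = magnitude / total

    # Draw indices with replacement; an activation survives if drawn at least once.
    drawn = rng.choice(h.size, size=num_samples, replace=True, p=p)
    keep = np.zeros(h.size, dtype=bool)
    keep[drawn] = True

    # Probability that activation i is drawn at least once in num_samples draws.
    keep_prob = 1.0 - (1.0 - p) ** num_samples

    # Zero the pruned activations and rescale survivors to keep the expectation.
    out = np.zeros_like(h)
    out[keep] = h[keep] / keep_prob[keep]
    return out


rng = np.random.default_rng(0)
activations = np.array([0.0, 1.0, -3.0, 0.5])
print(sap_layer(activations, num_samples=2, rng=rng))
```

Because the pruning mask is resampled on every forward pass, repeated calls on the same input generally give different outputs, which is why gradients computed through such a layer are noisy.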

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Optical Sensing Technologies · Stochastic Gradient Optimization Techniques

Methods: Pruning