Erratum Concerning the Obfuscated Gradients Attack on Stochastic Activation Pruning
Guneet S. Dhillon, Nicholas Carlini

TL;DR
This erratum clarifies that the previously reported success of the Obfuscated Gradients attack on SAP stemmed from a flawed re-implementation of the defense. Applied correctly, SAP withstands that attack, but a new BPDA-based attack still reduces its accuracy to 0.1%.
Contribution
It identifies a flaw in the re-implementation used in the prior evaluation of SAP and shows that, when SAP is applied properly, the originally proposed attack is not effective.
Findings
Incorrect implementation led to underestimating SAP's robustness
Properly implemented SAP resists Obfuscated Gradients attack
A new use of the BPDA attack technique can still reduce SAP's accuracy to 0.1%
Abstract
Stochastic Activation Pruning (SAP) (Dhillon et al., 2018) is a defense to adversarial examples that was attacked and found to be broken by the "Obfuscated Gradients" paper (Athalye et al., 2018). We discover a flaw in the re-implementation that artificially weakens SAP. When SAP is applied properly, the proposed attack is not effective. However, we show that a new use of the BPDA attack technique can still reduce the accuracy of SAP to 0.1%.
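For readers unfamiliar with the defense, the following is a minimal NumPy sketch of the pruning step as it is generally described for SAP (Dhillon et al., 2018): activations are sampled with replacement in proportion to their magnitude, unsampled activations are zeroed, and the survivors are rescaled by the inverse of their probability of being kept. This is an illustration under that assumption, not the code evaluated in this erratum or in the Obfuscated Gradients paper. The function name `sap_layer` and its parameters are hypothetical.

```python
import numpy as np


def sap_layer(h, num_samples, rng):
    """SAP-style stochastic pruning of a 1-D activation vector (illustrative sketch)."""
    h = np.asarray(h, dtype=float)
    abs_h = np.abs(h)
    total = abs_h.sum()
    if total == 0.0:
        # All activations are zero, so there is nothing to sample from.
        return h.copy()

    # Sampling probability of each activation, proportional to its magnitude.
    p = abs_h / total

    # Draw indices with replacement; an activation survives if drawn at least once.
    drawn = rng.choice(h.size, size=num_samples, replace=True, p=p)
    keep = np.zeros(h.size, dtype=bool)
    keep[drawn] = True

    # Probability that each activation is drawn at least once in num_samples draws.
    keep_prob = 1.0 - (1.0 - p) ** num_samples

    # Zero the pruned activations and rescale the survivors so that the
    # output equals the input in expectation.
    out = np.zeros_like(h)
    out[keep] = h[keep] / keep_prob[keep]
    return out


rng = np.random.default_rng(0)
activations = np.array([0.0, 1.0, -2.0, 3.0])
pruned = sap_layer(activations, num_samples=3, rng=rng)
```

Because the output depends on random draws, two forward passes on the same input generally differ. That randomness is why details of how the defense is re-implemented can change the outcome of an attack evaluation.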
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Optical Sensing Technologies · Stochastic Gradient Optimization Techniques
Methods: Pruning
