Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Anish Athalye, Nicholas Carlini, David Wagner

TL;DR
This paper shows that defenses relying on obfuscated gradients give a false sense of security against adversarial examples: they appear to stop iterative optimization-based attacks, yet new attack techniques circumvent them systematically.
Contribution
The paper identifies and characterizes obfuscated gradients, develops an attack technique for each of the three types it describes, and shows in a case study of ICLR 2018 defenses that the phenomenon is both common and exploitable.
Findings
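Obfuscated gradients are common in recent defenses: 7 of the 9 non-certified white-box-secure defenses at ICLR 2018 rely on them.
The new attacks circumvent 6 of those 7 defenses completely and 1 partially, within the original threat model each paper considers.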
Obfuscated gradients do not provide true security against adversarial attacks.
Abstract
We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.
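The abstract refers to attack techniques without spelling them out. One technique from the full paper, Backward Pass Differentiable Approximation (BPDA), can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not the authors' code: the toy linear classifier, the quantization "defense", and all names and parameter values (`quantize`, `bpda_pgd`, `eps`, `step`) are hypothetical. The idea is to run the real, non-differentiable defense on the forward pass and to treat it as the identity function on the backward pass, so that a usable gradient reaches the input.

```python
import numpy as np

def quantize(x, levels=8):
    # Hypothetical defense: rounding has zero gradient almost everywhere,
    # so a naive gradient-based attack receives no useful signal.
    return np.round(x * levels) / levels

def input_grad(z, w, b, y):
    # Gradient of the logistic loss of a linear model with respect to its input z.
    p = 1.0 / (1.0 + np.exp(-(z @ w + b)))
    return (p - y) * w

def bpda_pgd(x, y, w, b, eps=0.3, step=0.05, iters=40):
    # Projected gradient ascent on the loss, with BPDA for the defense.
    x_adv = x.copy()
    for _ in range(iters):
        z = quantize(x_adv)                 # forward pass: the real defense
        g = input_grad(z, w, b, y)          # backward pass: assume dz/dx = identity
        x_adv = x_adv + step * np.sign(g)   # step in the loss-increasing direction
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay inside the L-infinity ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay inside the valid input range
    return x_adv

def predict(x, w, b):
    # The defended classifier: quantize first, then apply the linear model.
    return int(quantize(x) @ w + b > 0)

# Toy example (all values are illustrative assumptions).
w = np.array([1.0, -1.0, 0.5, -0.5])
b = 0.0
x = np.array([0.6, 0.4, 0.5, 0.5])
y = 1

x_adv = bpda_pgd(x, y, w, b)
print("clean prediction:", predict(x, w, b))
print("adversarial prediction:", predict(x_adv, w, b))
```

In this toy setting the identity approximation is reasonable because the quantizer satisfies quantize(x) ≈ x. A defense whose output differs more from its input would need a different differentiable stand-in on the backward pass.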
Videos
#040 - Adversarial Examples (Dr. Nicholas Carlini, Dr. Wieland Brendel, Florian Tramèr) · YouTube
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security
