Gradient Masking Causes CLEVER to Overestimate Adversarial Perturbation Size
Ian Goodfellow

TL;DR
This report shows that gradient masking causes CLEVER, a scoring method intended to estimate a lower bound on the size of adversarial perturbation needed to fool a model, to overestimate that size. CLEVER therefore does not resolve the key limitation of attack-based security assessments.
Contribution
The report shows that under gradient masking CLEVER's estimate can exceed the true minimum perturbation size, so it is not a guaranteed lower bound and shares the main weakness of attack-based evaluation.
Findings
Gradient masking causes CLEVER to overestimate perturbation size
CLEVER fails to provide a true lower bound on adversarial perturbations
Attack algorithms tend to provide loose upper bounds due to gradient masking
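The distinction between the two kinds of bound can be written out as follows. The notation is an assumption made here for illustration and does not come from this page: $f_i$ are the model's class scores, $c$ is the class predicted at the input $x_0$, $L_j$ is a local Lipschitz constant of $f_c - f_j$ near $x_0$, and $\hat{L}_j$ is its sampled estimate. The score is given in a simplified Lipschitz-based form, not necessarily CLEVER's exact definition.

```latex
% True minimum perturbation size that changes the predicted class
\delta^\star = \min_{\delta} \|\delta\|_p
  \quad \text{s.t.} \quad \arg\max_i f_i(x_0 + \delta) \neq c

% Any successful attack yields only an upper bound
\|\delta_{\text{attack}}\|_p \;\ge\; \delta^\star

% A security guarantee requires a lower bound
\beta \;\le\; \delta^\star

% Lipschitz-based estimate (simplified form)
\hat{\beta} = \min_{j \neq c} \frac{f_c(x_0) - f_j(x_0)}{\hat{L}_j}

% If sampling underestimates the Lipschitz constant, the estimate
% is no longer guaranteed to be a lower bound
\hat{L}_j < L_j \;\Longrightarrow\; \hat{\beta} \text{ may exceed } \delta^\star
```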
Abstract
A key problem in research on adversarial examples is that vulnerability to adversarial examples is usually measured by running attack algorithms. Because the attack algorithms are not optimal, they are prone to overestimating the size of perturbation needed to fool the target model. In other words, the attack-based methodology provides an upper bound on the size of a perturbation that will fool the model, but security guarantees require a lower bound. CLEVER is a proposed scoring method to estimate a lower bound. Unfortunately, an estimate of a bound is not a bound. In this report, we show that gradient masking, a common problem that causes attack methodologies to provide only a very loose upper bound, causes CLEVER to overestimate the size of perturbation needed to fool the model. In other words, CLEVER does not resolve the key problem with the attack-based…
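The effect described in the abstract can be illustrated with a one-dimensional toy problem. This is a minimal sketch under stated assumptions, not the CLEVER implementation and not the report's experiment: the saturating `margin` function, the constants, and the helper names are all hypothetical, and the Lipschitz estimate is simply the largest sampled gradient norm, a stand-in for CLEVER's statistical estimate. Capping the score at the sampling radius is also an assumption of this sketch.

```python
import numpy as np

# Hypothetical toy model: margin(x) > 0 means the input is classified correctly.
# The margin saturates almost everywhere (gradient masking) and flips sign in a
# very narrow region around `boundary`.
def margin(x, k=1e8, boundary=0.3):
    return np.tanh(k * (boundary - x))

def margin_grad_norm(x, k=1e8, boundary=0.3):
    # Analytic |d margin / dx|; numerically zero wherever tanh has saturated.
    return k * (1.0 - np.tanh(k * (boundary - x)) ** 2)

def sampled_score(x0, radius, n_samples, seed=0):
    """Simplified Lipschitz-based score: margin / (max sampled gradient norm).

    Assumption: the max over random samples stands in for the Lipschitz
    estimate, and the score is capped at the sampling radius.
    """
    rng = np.random.default_rng(seed)
    xs = x0 + rng.uniform(-radius, radius, size=n_samples)
    lipschitz_est = float(np.max(margin_grad_norm(xs)))
    if lipschitz_est == 0.0:
        # Every sampled gradient vanished, so the score hits the cap.
        return radius, lipschitz_est
    return min(float(margin(x0)) / lipschitz_est, radius), lipschitz_est

x0, radius, true_distance = 0.0, 1.0, 0.3
score, lipschitz_est = sampled_score(x0, radius, n_samples=256)

print("estimated Lipschitz constant:", lipschitz_est)
print("score (claimed lower bound): ", score)
print("true minimum perturbation:   ", true_distance)
```

Because the steep region is so narrow, random samples almost never land in it, so the sampled gradients understate how quickly the margin can change. The resulting score then exceeds the true distance to the decision boundary, which is the failure mode the abstract describes.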
Taxonomy
Topics: Adversarial Robustness in Machine Learning
