Measuring the False Sense of Security

Carlos Gomes

arXiv:2204.04778 · cs.LG · April 12, 2022

TL;DR

This paper introduces new metrics to quantify gradient masking in neural networks, enabling efficient comparison of defenses and revealing the varying degrees of this phenomenon across models.

Contribution

It proposes and empirically validates computationally inexpensive metrics for measuring gradient masking, moving beyond binary assessments.

Findings

01 Metrics effectively measure gradient masking levels

02 Metrics enable comparison across different networks

03 Gradient masking varies in extent among models

Abstract

Recently, several papers have demonstrated how widespread gradient masking is amongst proposed adversarial defenses. Defenses that rely on this phenomenon are considered failed, and can easily be broken. Despite this, there has been little investigation into ways of measuring the phenomenon of gradient masking and enabling comparisons of its extent amongst different networks. In this work, we investigate gradient masking under the lens of its mensurability, departing from the idea that it is a binary phenomenon. We propose and motivate several metrics for it, performing extensive empirical tests on defenses suspected of exhibiting different degrees of gradient masking. These are computationally cheaper than strong attacks, enable comparisons between models, and do not require the large time investment of tailor-made attacks for specific models. Our results reveal metrics that are…

Taxonomy

Topics: Terrorism, Counterterrorism, and Political Violence · Adversarial Robustness in Machine Learning