Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations

Florian Tramèr; Jens Behrmann; Nicholas Carlini; Nicolas Papernot; Jörn-Henrik Jacobsen

arXiv:2002.04599 · cs.LG · August 5, 2020 · 29 cites

TL;DR

This paper studies the tradeoff between a model's excessive sensitivity and its excessive invariance to adversarial perturbations. It shows that defenses against sensitivity-based attacks can weaken resistance to invariance-based attacks, and argues that new approaches are needed to resist both.

Contribution

The paper studies invariance-based adversarial examples, which change an input's true label while preserving the model's prediction. It demonstrates that they exist for current models and shows that adversarially-trained and certifiably-robust defenses can be broken by them.

Findings

01 State-of-the-art models can be broken by small invariance-based perturbations.

02 Defenses against sensitivity attacks can reduce robustness to invariance attacks.

03 Overly invariant classifiers stem from overly-robust features in datasets.

Abstract

Adversarial examples are malicious inputs crafted to induce misclassification. Commonly studied sensitivity-based adversarial examples introduce semantically-small changes to an input that result in a different model prediction. This paper studies a complementary failure mode, invariance-based adversarial examples, that introduce minimal semantic changes that modify an input's true label yet preserve the model's prediction. We demonstrate fundamental tradeoffs between these two types of adversarial examples. We show that defenses against sensitivity-based attacks actively harm a model's accuracy on invariance-based attacks, and that new approaches are needed to resist both attack types. In particular, we break state-of-the-art adversarially-trained and certifiably-robust models by generating small perturbations that the models are (provably) robust to, yet that change an input's class…
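The two failure modes described in the abstract can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: the two-dimensional inputs, the hypothetical `oracle` labeler that stands in for the true label, the two hand-written toy models, and the choice of an L-infinity ball of radius `eps` as the notion of a "small" change.

```python
import numpy as np

def oracle(x):
    """Hypothetical ground-truth labeler: the class depends only on x[0]."""
    return int(x[0] > 0)

def overly_sensitive(x):
    """Toy model that over-weights x[1], a feature the oracle ignores."""
    return int(x[0] + 10.0 * x[1] > 0)

def overly_invariant(x):
    """Toy model whose boundary (x[0] = -0.5) is too far from the oracle's (x[0] = 0)."""
    return int(x[0] > -0.5)

def within_ball(x, x_adv, eps):
    """True if x_adv lies in the L-infinity ball of radius eps around x (assumed metric)."""
    return bool(np.max(np.abs(x_adv - x)) <= eps)

def is_sensitivity_adv(model, x, x_adv, eps):
    """Small change, true label unchanged, model prediction changes."""
    return (within_ball(x, x_adv, eps)
            and oracle(x_adv) == oracle(x)
            and model(x_adv) != model(x))

def is_invariance_adv(model, x, x_adv, eps):
    """Small change, true label changes, model prediction unchanged."""
    return (within_ball(x, x_adv, eps)
            and oracle(x_adv) != oracle(x)
            and model(x_adv) == model(x))

x = np.array([0.2, 0.0])        # clean input, true label 1
eps = 0.3
x_sens = np.array([0.2, -0.1])  # moves only the feature the oracle ignores
x_inv = np.array([-0.05, 0.0])  # crosses the oracle's boundary at x[0] = 0

print(is_sensitivity_adv(overly_sensitive, x, x_sens, eps))
print(is_invariance_adv(overly_invariant, x, x_inv, eps))
```

In this sketch `overly_invariant` gives the same prediction everywhere in the `eps`-ball around `x`, so it is robust in the sensitivity sense, yet the ball contains a point whose true label differs. That is the kind of tension the abstract describes, shown here only on a toy problem.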

Code & Models

Repositories

ftramer/Excessive-Invariance (tf, Official)

Videos

Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations · slideslive

Taxonomy

Topics: Adversarial Robustness in Machine Learning