Adversarial Examples Are Not Real Features

Ang Li; Yifei Wang; Yiwen Guo; Yisen Wang

arXiv:2310.18936 · cs.LG · May 7, 2024 · 1 citation

Open Access · 1 Repo

TL;DR

This paper challenges the view that adversarial examples arise from useful non-robust features. It shows that these features behave more like shortcuts: they transfer poorly across learning paradigms, and training on robust features alone does not ensure robustness.

Contribution

The study re-examines the role of non-robust features across multiple learning paradigms, revealing their limited usefulness and their nature as paradigm-specific shortcuts.

Findings

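01  Non-robust features transfer poorly from supervised learning to other paradigms such as contrastive learning, masked image modeling, and diffusion models.

02  Encoders trained naturally on robust features remain non-robust under AutoAttack.

03  Non-robust features are more like paradigm-specific shortcuts than genuinely useful features.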

Abstract

The existence of adversarial examples has been a mystery for years and has attracted much interest. A well-known theory by Ilyas et al. (2019) explains adversarial vulnerability from a data perspective by showing that one can extract non-robust features from adversarial examples and that these features alone are useful for classification. However, the explanation remains quite counter-intuitive, since non-robust features mostly look like noise to humans. In this paper, we re-examine the theory in a larger context by incorporating multiple learning paradigms. Notably, we find that, contrary to their good usefulness under supervised learning, non-robust features attain poor usefulness when transferred to self-supervised learning paradigms such as contrastive learning, masked image modeling, and diffusion models. This reveals that non-robust features are not really as useful as…
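To make "extracting non-robust features from adversarial examples" concrete, here is a minimal sketch of the relabeling recipe usually attributed to Ilyas et al. (2019): each input is perturbed toward a target class within a small L-infinity ball, then stored with the target as its new label. Everything below is an illustrative assumption rather than the authors' code: the toy linear softmax classifier, the helper names (`targeted_pgd`, `build_nonrobust_dataset`), the deterministic target rule `(y + 1) % num_classes`, and the values of `eps`, `alpha`, and `steps`. The paper itself works with deep networks and image data.

```python
import math


def sign(g):
    """Return -1, 0, or 1 according to the sign of g."""
    return (g > 0) - (g < 0)


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def logits(W, b, x):
    """Logits of a toy linear classifier (hypothetical stand-in for a trained network)."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def targeted_pgd(W, b, x, target, eps=0.5, alpha=0.1, steps=20):
    """Signed-gradient descent on the target-class loss, kept within eps of x (L-infinity)."""
    x_adv = list(x)
    for _ in range(steps):
        p = softmax(logits(W, b, x_adv))
        # Gradient of cross-entropy(target) w.r.t. the input of a linear-softmax model:
        # W^T (p - onehot(target)).
        err = [p_k - (1.0 if k == target else 0.0) for k, p_k in enumerate(p)]
        grad = [sum(W[k][j] * err[k] for k in range(len(W))) for j in range(len(x))]
        # Step toward the target class, then project back into the eps-ball around x.
        x_adv = [v - alpha * sign(g) for v, g in zip(x_adv, grad)]
        x_adv = [min(max(v, x0 - eps), x0 + eps) for v, x0 in zip(x_adv, x)]
    return x_adv


def build_nonrobust_dataset(W, b, data, num_classes, eps=0.5):
    """Relabel each perturbed input with its attack target (assumed rule: y + 1 mod C)."""
    relabeled = []
    for x, y in data:
        target = (y + 1) % num_classes
        relabeled.append((targeted_pgd(W, b, x, target, eps=eps), target))
    return relabeled


# Toy usage: two classes separated along the first coordinate.
W = [[1.0, 0.0], [-1.0, 0.0]]
b = [0.0, 0.0]
data = [([1.0, 0.0], 0), ([-1.0, 0.0], 1)]
nonrobust = build_nonrobust_dataset(W, b, data, num_classes=2)
```

In the relabeled set, the visible content of each input still matches its original class, so only the small perturbation agrees with the new label. A model that learns from such data and still classifies clean test inputs well is the evidence for "useful" non-robust features; the abstract's claim is that this usefulness largely disappears when the same data is used under self-supervised paradigms.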

Code & Models

Repositories

pku-ml/advnotrealfeatures (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: Diffusion