Closer Look at the Transferability of Adversarial Examples: How They Fool Different Models Differently

Futa Waseda, Sosuke Nishikawa, Trung-Nghia Le, Huy H. Nguyen, and Isao Echizen

arXiv:2112.14337·cs.LG·October 21, 2022·5 cites

Open Access

TL;DR

This paper investigates how adversarial examples transfer between models. It finds that they often cause the target model to make the same misclassification as the source model, but that they can also cause a different misclassification, and it attributes the difference to how each model uses non-robust features.

Contribution

It introduces a class-aware analysis of adversarial transferability that distinguishes "same mistakes" from "different mistakes", and links the two cases to non-robust features that models use differently.

Findings

01. Adversarial examples often cause the same mistakes across models.

02. Different mistakes can occur even between similar models.

03. Non-robust features explain class-aware transferability.

Abstract

Deep neural networks are vulnerable to adversarial examples (AEs), which exhibit adversarial transferability: AEs generated for a source model can also mislead the predictions of another (target) model. However, transferability has not been understood in terms of which class the target model's predictions are misled to (i.e., class-aware transferability). In this paper, we differentiate the cases in which a target model predicts the same wrong class as the source model ("same mistake") or a different wrong class ("different mistake") to analyze and provide an explanation of the mechanism. We find that (1) AEs tend to cause same mistakes, which correlates with "non-targeted transferability"; however, (2) different mistakes occur even between similar models, regardless of the perturbation size. Furthermore, we present evidence that the difference between same mistakes and different mistakes can be…
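To make the "same mistake" and "different mistake" distinction concrete, the following is a minimal sketch of how the three outcomes could be tallied from predicted labels. It is not the authors' code: the function name `class_aware_outcomes`, the "unfooled" label for the case where the target stays correct, and the choice to count only AEs that actually fool the source model are all assumptions made for illustration.

```python
from collections import Counter


def class_aware_outcomes(true_labels, source_preds, target_preds):
    """Tally class-aware transfer outcomes for a batch of adversarial examples.

    All three arguments are equal-length sequences of class labels:
    the ground truth, the source model's prediction on each AE, and the
    target model's prediction on the same AE.
    """
    counts = Counter()
    for y, src, tgt in zip(true_labels, source_preds, target_preds):
        if src == y:
            # Assumption: AEs that fail to fool the source model are skipped.
            continue
        if tgt == y:
            counts["unfooled"] += 1           # target still predicts correctly
        elif tgt == src:
            counts["same_mistake"] += 1       # same wrong class as the source
        else:
            counts["different_mistake"] += 1  # a different wrong class
    total = sum(counts.values())
    keys = ("unfooled", "same_mistake", "different_mistake")
    # Return each outcome as a fraction of the AEs that fooled the source.
    return {k: (counts[k] / total if total else 0.0) for k in keys}


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    rates = class_aware_outcomes(
        true_labels=[0, 0, 0, 0, 1],
        source_preds=[1, 1, 2, 0, 0],
        target_preds=[1, 0, 1, 3, 2],
    )
    print(rates)
```

Under this sketch, the same-mistake and different-mistake rates together give the fraction of source-fooling AEs on which the target model also predicts a wrong class.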

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications