Transferable Perturbations of Deep Feature Distributions

Nathan Inkawhich; Kevin J Liang; Lawrence Carin; Yiran Chen

arXiv:2004.12519 · cs.LG · April 28, 2020 · 20 cites

Open Access

TL;DR

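This paper introduces an adversarial attack that models and exploits class-wise and layer-wise deep feature distributions in CNNs instead of relying only on the output layer. It reports state-of-the-art targeted blackbox transfer results on undefended ImageNet models and prioritizes explaining how attacks alter intermediate features.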

Contribution

A new attack built on class-wise and layer-wise feature distributions, which improves targeted transferability and supports analysis of how attacks shift intermediate feature distributions.

Findings

01

Achieves state-of-the-art targeted blackbox transfer-based attack results against undefended ImageNet models.

02

Provides analysis of feature distribution changes caused by adversarial attacks.

03

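Provides a measure of layer-wise and class-wise feature distributional separability/entanglement, and conceptualizes a transition from task/data-specific to model-specific features that affects transferability.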

Abstract

Almost all current adversarial attacks of CNN classifiers rely on information derived from the output layer of the network. This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions. We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models. Further, we place a priority on explainability and interpretability of the attacking process. Our methodology affords an analysis of how adversarial attacks change the intermediate feature distributions of CNNs, as well as a measure of layer-wise and class-wise feature distributional separability/entanglement. We also conceptualize a transition from task/data-specific to model-specific features within a CNN architecture that directly impacts the transferability of adversarial examples.
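
The abstract does not spell out the attack objective, so the following is only an illustrative sketch of the general idea of perturbing an input so that its intermediate-layer features look like they come from a target class. It assumes a toy feature map `tanh(W x)` in place of a CNN layer and a hypothetical logistic auxiliary model that scores whether those features belong to the target class. The names `feature_map`, `target_prob`, and `feature_distribution_attack`, the L-infinity budget `eps`, and the signed-gradient update are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def feature_map(x, W):
    """Toy stand-in for an intermediate CNN layer (assumption): tanh(W x)."""
    return np.tanh(W @ x)


def target_prob(x, W, a, b):
    """Hypothetical auxiliary model: probability that the layer features
    of x belong to the target class, modeled as a logistic regression."""
    return sigmoid(a @ feature_map(x, W) + b)


def feature_distribution_attack(x, W, a, b, eps=0.1, step=0.01, iters=50):
    """Signed-gradient ascent on log p(target | features), keeping the
    perturbation inside an L-infinity ball of radius eps around x."""
    x_adv = x.copy()
    for _ in range(iters):
        h = feature_map(x_adv, W)
        p = sigmoid(a @ h + b)
        # d/dx log sigmoid(z) = (1 - p) * dz/dx, with z = a . tanh(W x) + b
        grad = (1.0 - p) * (W.T @ (a * (1.0 - h ** 2)))
        x_adv = x_adv + step * np.sign(grad)
        # Project back onto the allowed perturbation budget.
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv


# Usage with random toy parameters (all values are illustrative).
rng = np.random.default_rng(0)
W = 0.3 * rng.normal(size=(8, 16))   # toy layer weights
a = rng.normal(size=8)               # toy auxiliary-model weights
b = 0.0
x = rng.uniform(size=16)             # toy "clean" input

x_adv = feature_distribution_attack(x, W, a, b)
print("target-class feature probability before:", target_prob(x, W, a, b))
print("target-class feature probability after: ", target_prob(x_adv, W, a, b))
```

In a real setting the toy feature map would be replaced by a chosen layer of a whitebox CNN, and the perturbed input would then be evaluated on separate blackbox models to measure transfer.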

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research · Anomaly Detection Techniques and Applications

Methods: Interpretability