A Geometric Perspective on the Transferability of Adversarial Directions
Zachary Charles, Harrison Rosenberg, Dimitris Papailiopoulos

TL;DR
This paper investigates why adversarial perturbations transfer across classifiers and neural network architectures. It gives theoretical guarantees, backed by empirical validation, that there exist directions which are adversarial for many classifiers and data points at once.
Contribution
The paper introduces the concept of transferable adversarial directions, proves that they exist for linear classifiers and certain two-layer ReLU networks, and examines when such directions transfer between model classes.
Findings
Transferable adversarial directions exist for linear classifiers and two-layer ReLU networks.
Adversarial directions for ReLU networks can transfer to linear classifiers.
Empirical validation confirms theoretical results even for deeper networks.
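The first finding can be illustrated with a small numerical sketch. This is not the paper's construction or experiment: the Gaussian data model, the least-squares classifiers, the choice of the class-mean direction, and all names (sample, fit_linear, run_demo, eps) are assumptions made for illustration. It trains two linear classifiers on independent samples from the same distribution and checks whether one fixed direction, estimated from a third sample, flips the predictions of both.

```python
import numpy as np


def sample(rng, n, mu):
    """Draw n labelled points: x = y * mu + standard Gaussian noise, y in {-1, +1}."""
    y = rng.choice([-1.0, 1.0], size=n)
    x = y[:, None] * mu + rng.standard_normal((n, mu.size))
    return x, y


def fit_linear(x, y):
    """Least-squares linear classifier through the origin; predicts sign(x @ w)."""
    w, *_ = np.linalg.lstsq(x, y, rcond=None)
    return w


def accuracy(w, x, y):
    """Fraction of points whose predicted sign matches the label."""
    return float(np.mean(np.sign(x @ w) == y))


def run_demo(seed=0, d=20, n=500, eps=6.0):
    rng = np.random.default_rng(seed)
    mu = np.zeros(d)
    mu[0] = 3.0  # hypothetical class mean; the two classes sit at +mu and -mu

    # Two classifiers trained on independent samples from the same distribution.
    w1 = fit_linear(*sample(rng, n, mu))
    w2 = fit_linear(*sample(rng, n, mu))

    # Candidate shared direction: normalized class-mean estimate from a third
    # sample (an illustrative heuristic, not taken from the paper).
    xs, ys = sample(rng, n, mu)
    v = (ys[:, None] * xs).mean(axis=0)
    v /= np.linalg.norm(v)

    # Push every test point a distance eps along v, toward the opposite class.
    x_test, y_test = sample(rng, n, mu)
    x_adv = x_test - eps * y_test[:, None] * v

    return {
        "clean": (accuracy(w1, x_test, y_test), accuracy(w2, x_test, y_test)),
        "perturbed": (accuracy(w1, x_adv, y_test), accuracy(w2, x_adv, y_test)),
    }


if __name__ == "__main__":
    print(run_demo())
```

In this toy setting the perturbation size eps is deliberately large (twice the distance from a class mean to the origin), so the sketch only shows that a single direction can be adversarial for both classifiers and for most points at once. It says nothing about the minimal perturbation size or the probabilistic guarantees the paper proves.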
Abstract
State-of-the-art machine learning models frequently misclassify inputs that have been perturbed in an adversarial manner. Adversarial perturbations generated for a given input and a specific classifier often seem to be effective on other inputs and even different classifiers. In other words, adversarial perturbations seem to transfer between different inputs, models, and even different neural network architectures. In this work, we show that in the context of linear classifiers and two-layer ReLU networks, there provably exist directions that give rise to adversarial perturbations for many classifiers and data points simultaneously. We show that these "transferable adversarial directions" are guaranteed to exist for linear separators of a given set, and will exist with high probability for linear classifiers trained on independent sets drawn from the same distribution. We extend our…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
