A Geometric Perspective on the Transferability of Adversarial Directions

Zachary Charles; Harrison Rosenberg; Dimitris Papailiopoulos

arXiv:1811.03531 · cs.LG · November 9, 2018 · 6 cites

Open Access

TL;DR

This paper investigates why adversarial perturbations transfer across classifiers and neural network architectures. It gives theoretical guarantees that transferable adversarial directions exist and validates those guarantees empirically.

Contribution

It introduces the concept of transferable adversarial directions, proves their existence for linear and certain ReLU networks, and explores their transferability properties.

Findings

01. Transferable adversarial directions exist for linear classifiers and two-layer ReLU networks.

02. Adversarial directions for ReLU networks can transfer to linear classifiers.

03. Empirical validation confirms theoretical results even for deeper networks.

Abstract

State-of-the-art machine learning models frequently misclassify inputs that have been perturbed in an adversarial manner. Adversarial perturbations generated for a given input and a specific classifier often seem to be effective on other inputs and even different classifiers. In other words, adversarial perturbations seem to transfer between different inputs, models, and even different neural network architectures. In this work, we show that in the context of linear classifiers and two-layer ReLU networks, there provably exist directions that give rise to adversarial perturbations for many classifiers and data points simultaneously. We show that these "transferable adversarial directions" are guaranteed to exist for linear separators of a given set, and will exist with high probability for linear classifiers trained on independent sets drawn from the same distribution. We extend our…
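
The idea of a single direction that is adversarial for several linear classifiers at once can be illustrated with a small NumPy experiment. This is a toy sketch, not the paper's construction or proof: the Gaussian data model, the least-squares classifier, the difference-of-class-means direction, the sign convention (each point is pushed along +v or -v toward the opposite class), and the large step size `eps` are all assumptions chosen for illustration, and the helper names `fit_linear`, `flip_rate`, and `sample` are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear classifier sign(w @ x + b), labels in {-1, +1}."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

def flip_rate(w, b, X, y, v, eps):
    """Fraction of correctly classified points that become misclassified
    after moving each point eps units along v toward the opposite class."""
    correct = np.sign(X @ w + b) == y
    X_adv = X - eps * y[:, None] * v
    flipped = np.sign(X_adv @ w + b) != y
    return float(np.mean(flipped[correct]))

def sample(rng, n, mu):
    """Two Gaussian classes with means +mu and -mu (assumed data model)."""
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu + rng.normal(size=(n, mu.size))
    return X, y

rng = np.random.default_rng(0)
d = 20
mu = np.zeros(d)
mu[0] = 2.0

# Two classifiers trained on independent samples from the same distribution.
X1, y1 = sample(rng, 500, mu)
X2, y2 = sample(rng, 500, mu)
X_test, y_test = sample(rng, 1000, mu)
w1, b1 = fit_linear(X1, y1)
w2, b2 = fit_linear(X2, y2)

# Candidate shared direction, computed from the first training set only.
v = X1[y1 == 1].mean(axis=0) - X1[y1 == -1].mean(axis=0)
v /= np.linalg.norm(v)

# Baseline: a random unit direction with the same step size.
v_rand = rng.normal(size=d)
v_rand /= np.linalg.norm(v_rand)

eps = 4.0  # deliberately large for this toy problem
rate1 = flip_rate(w1, b1, X_test, y_test, v, eps)
rate2 = flip_rate(w2, b2, X_test, y_test, v, eps)
rate_rand = flip_rate(w2, b2, X_test, y_test, v_rand, eps)
print(f"classifier 1: {rate1:.2f}, classifier 2: {rate2:.2f}, random: {rate_rand:.2f}")
```

In this setup the direction computed from the first sample also flips most predictions of the second classifier, which never saw that sample, while a random direction of the same length flips far fewer. That is the sense of "transfer" the abstract describes, shown here only for one synthetic distribution.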

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications