Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks
Huishuai Zhang, Da Yu, Yiping Lu, Di He

TL;DR
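This paper shows that adversarial noises crafted by one-step gradient methods become linearly separable once paired with their labels, for two-layer randomly initialized networks and in the neural tangent kernel regime. A linear classifier trained on these noises also classifies the adversarial noises of unseen data well, which gives a way to study the structure of adversarial perturbations.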
Contribution
The paper proves that adversarial noises are linearly separable for a two-layer randomly initialized network and in the neural tangent kernel setup, where the parameters stay close to initialization, and backs the theory with experiments.
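One way to write the claim down, as a sketch only: the notation below (width m, input dimension d, weights W and a, activation \sigma, step size \epsilon, loss \ell) is assumed for illustration, and the paper's exact scaling and sign conventions may differ.

```latex
% Two-layer network with randomly initialized entries (illustrative notation)
f(x) = \frac{1}{\sqrt{m}}\, a^\top \sigma(W x),
\qquad W \in \mathbb{R}^{m \times d},\; a \in \mathbb{R}^{m}

% One-step gradient adversarial noise for the labeled example (x_i, y_i)
\delta_i = \epsilon \, \nabla_x \ell\bigl(f(x_i), y_i\bigr)

% Linear separability of the noises with their labels y_i \in \{-1, +1\}
\exists\, w \in \mathbb{R}^{d} \ \text{such that} \quad
y_i \,\langle w, \delta_i \rangle > 0 \quad \text{for all } i
```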
Findings
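Adversarial noises crafted by one-step gradient methods are linearly separable when paired with their labels, for two-layer randomly initialized networks and in the neural tangent kernel regime.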
A linear classifier trained on adversarial noises can classify test adversarial noises effectively.
The linear separability of adversarial noises weakens when the network departs from the theoretical assumptions (random initialization, or parameters close to initialization).
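The findings above can be illustrated with a small NumPy sketch. It is not the authors' code: the ReLU activation, the logistic loss, the 1/sqrt(m) scaling, the Gaussian inputs with random labels, the perceptron as the linear classifier, and all function names and sizes are assumptions chosen for illustration.

```python
import numpy as np


def make_random_two_layer_net(d, m, rng):
    """Random two-layer ReLU net f(x) = a^T relu(W x) / sqrt(m) (assumed form)."""
    W = rng.standard_normal((m, d))
    a = rng.choice([-1.0, 1.0], size=m)
    return W, a


def one_step_adversarial_noise(X, y, W, a, eps=0.1):
    """One gradient-ascent step on the logistic loss with respect to the input."""
    m = W.shape[0]
    pre = X @ W.T                                # pre-activations, shape (n, m)
    f = np.maximum(pre, 0.0) @ a / np.sqrt(m)    # network outputs, shape (n,)
    grad_f = ((pre > 0) * a) @ W / np.sqrt(m)    # d f / d x, shape (n, d)
    # For loss log(1 + exp(-y f)), d loss / d f = -y * sigmoid(-y f).
    dloss_df = -y / (1.0 + np.exp(y * f))
    return eps * dloss_df[:, None] * grad_f      # noises, shape (n, d)


def train_perceptron(D, y, max_epochs=100):
    """Bias-free perceptron; stops early once every training noise is classified."""
    w = np.zeros(D.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for delta, label in zip(D, y):
            if label * (delta @ w) <= 0:
                w += label * delta
                mistakes += 1
        if mistakes == 0:
            break
    return w


def accuracy(w, D, y):
    """Fraction of noises whose sign under w matches the label."""
    return float(np.mean(np.sign(D @ w) == y))


def run_experiment(d=100, m=2000, n_train=200, n_test=200, seed=0):
    rng = np.random.default_rng(seed)
    W, a = make_random_two_layer_net(d, m, rng)

    def sample(n):
        # Inputs with norm close to 1 and labels drawn independently of the inputs.
        X = rng.standard_normal((n, d)) / np.sqrt(d)
        y = rng.choice([-1.0, 1.0], size=n)
        return X, y

    X_tr, y_tr = sample(n_train)
    X_te, y_te = sample(n_test)
    D_tr = one_step_adversarial_noise(X_tr, y_tr, W, a)
    D_te = one_step_adversarial_noise(X_te, y_te, W, a)

    w = train_perceptron(D_tr, y_tr)
    return accuracy(w, D_tr, y_tr), accuracy(w, D_te, y_te)


if __name__ == "__main__":
    train_acc, test_acc = run_experiment()
    print(f"train accuracy on noises: {train_acc:.3f}")
    print(f"test accuracy on noises:  {test_acc:.3f}")
```

The labels here are drawn independently of the inputs, so any accuracy above chance on the test noises has to come from label information carried by the gradient step rather than from the data itself.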
Abstract
Adversarial examples, which are usually generated for specific inputs with a specific model, are ubiquitous for neural networks. In this paper we unveil a surprising property of adversarial noises when they are put together, i.e., adversarial noises crafted by one-step gradient methods are linearly separable if equipped with the corresponding labels. We theoretically prove this property for a two-layer network with randomly initialized entries and the neural tangent kernel setup where the parameters are not far from initialization. The proof idea is to show the label information can be efficiently backpropagated to the input while keeping the linear separability. Our theory and experimental evidence further show that the linear classifier trained with the adversarial noises of the training data can well classify the adversarial noises of the test data, indicating that adversarial noises…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
Methods: Test
