Principal Component Properties of Adversarial Samples
Malhar Jere, Sandro Herbig, Christine Lind, Farinaz Koushanfar

TL;DR
This paper investigates how adversarial samples affect the principal component structure of images in neural networks, proposing a new robustness metric that effectively detects adversarial inputs across various models and attacks.
Contribution
It introduces a novel PCA-based analysis of adversarial samples and a new metric, the (k,p) point, to measure the adversarial robustness of neural networks and to detect adversarial samples.
Findings
Adversarial samples show similar principal component contributions across different attacks.
The (k,p) point metric achieves over 93% accuracy in detecting adversarial samples.
The analysis is consistent across multiple neural network architectures and attack types.
Abstract
Deep Neural Networks for image classification have been found to be vulnerable to adversarial samples, which consist of sub-perceptual noise added to a benign image that can easily fool trained neural networks, posing a significant risk to their commercial deployment. In this work, we analyze adversarial samples through the lens of their contributions to the principal components of each image, unlike prior work, in which authors performed PCA on the entire dataset. We investigate a number of state-of-the-art deep neural networks trained on ImageNet as well as several attacks for each of the networks. Our results demonstrate empirically that adversarial samples across several attacks have similar properties in their contributions to the principal components of neural network inputs. We propose a new metric for neural networks to measure their robustness to adversarial…
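To make the distinction between per-image PCA and dataset-level PCA concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names (top_k_reconstruction, explained_variance_ratio), the choice to treat image rows as observations, and the use of random noise as a stand-in for an adversarial perturbation are all assumptions made for illustration. The (k,p) point itself is not defined in this summary, so it is not computed here.

```python
import numpy as np


def top_k_reconstruction(image, k):
    """Rebuild a 2-D image from its first k principal components.

    Assumption: each row of the image is one observation and each
    column is one feature, so PCA is run on a single image only.
    """
    mean = image.mean(axis=0, keepdims=True)
    centered = image - mean
    # Rows of vt are the principal directions of this one image.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k] + mean


def explained_variance_ratio(image):
    """Fraction of the image's variance carried by each component."""
    centered = image - image.mean(axis=0, keepdims=True)
    s = np.linalg.svd(centered, compute_uv=False)
    variance = s ** 2
    return variance / variance.sum()


# Hypothetical data: a random "image" plus small random noise.
# Real adversarial noise would come from an attack, not from an RNG.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
perturbed = image + 0.01 * rng.standard_normal((32, 32))

clean_ratio = explained_variance_ratio(image)
perturbed_ratio = explained_variance_ratio(perturbed)
```

Comparing clean_ratio with perturbed_ratio component by component gives one simple view of how a perturbation shifts an image's principal component profile. Feeding top_k_reconstruction outputs to a classifier for increasing k would be one way to probe which components matter to a model.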
Taxonomy
Methods: Principal Components Analysis
