Feature compression is the root cause of adversarial fragility in neural network classifiers
Jingchao Gao, Ziqing Lu, Raghu Mudumbai, Xiaodong Wu, Jirong Yi, Myung Cho, Catherine Xu, Hui Xie, Weiyu Xu

TL;DR
This paper analyzes why deep neural networks are vulnerable to adversarial attacks. It argues that feature compression is the cause of this fragility and shows, with theoretical and experimental evidence, that adversarial robustness can degrade as the input dimension increases.
Contribution
It provides a matrix-theoretic explanation for adversarial fragility, linking feature compression to robustness degradation in neural networks.
Findings
Adversarial robustness decreases as input dimension increases.
A neural network's adversarial robustness can be only about 1/√d of the best possible robustness of an optimal classifier, where d is the input dimension.
Theoretical results align with experiments on ImageNet data.
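The 1/√d scaling in the findings above can be illustrated with a toy numerical sketch. This is not the paper's construction: it assumes two Gaussian class centers, a nearest-center rule standing in for the optimal classifier, and a hypothetical "compressed" classifier that thresholds a single random linear feature. The function names `robustness_ratio` and `mean_scaled_ratio` are made up for this example. Robustness is measured as the smallest L2 perturbation that moves a class center across the decision boundary.

```python
import numpy as np


def robustness_ratio(d, rng):
    """Ratio of compressed-classifier margin to optimal margin in dimension d.

    Toy assumption (not from the paper): two class centers drawn i.i.d.
    from a standard Gaussian, and a classifier that keeps only one random
    linear feature w.x of the input.
    """
    x1 = rng.standard_normal(d)
    x2 = rng.standard_normal(d)

    # Nearest-center rule: the boundary is the perpendicular bisector of
    # the segment x1-x2, so x1 sits half the center distance away from it.
    optimal_margin = np.linalg.norm(x1 - x2) / 2.0

    # Compressed rule: sign(w.x + b), with b chosen so that the two
    # centers land on opposite sides at equal feature distance.
    w = rng.standard_normal(d)
    b = -w @ (x1 + x2) / 2.0

    # Distance from x1 to the hyperplane w.x + b = 0.
    compressed_margin = abs(w @ x1 + b) / np.linalg.norm(w)

    return compressed_margin / optimal_margin


def mean_scaled_ratio(d, trials=200, seed=0):
    """Average ratio over random trials, multiplied by sqrt(d)."""
    rng = np.random.default_rng(seed)
    ratios = [robustness_ratio(d, rng) for _ in range(trials)]
    return float(np.mean(ratios) * np.sqrt(d))


if __name__ == "__main__":
    for d in (10, 100, 1000, 10000):
        # If the ratio shrinks like 1/sqrt(d), this value stays roughly flat.
        print(d, mean_scaled_ratio(d))
```

In this toy setting the ratio equals the absolute cosine of the angle between the random feature direction and the line joining the two centers, which is typically of order 1/√d for random directions. The sketch only shows how discarding all but one feature direction can shrink the margin with dimension; it says nothing about trained networks.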
Abstract
In this paper, we study the adversarial robustness of deep neural networks (NNs) for classification tasks by comparing it against that of optimal classifiers. We look at the smallest magnitude of additive perturbation that can change a classifier's output. We provide a matrix-theoretic explanation of the adversarial fragility of deep neural networks for classification. In particular, our theoretical results show that a neural network's adversarial robustness can degrade as the input dimension d increases. Analytically, we show that neural networks' adversarial robustness can be only 1/√d of the best possible adversarial robustness of optimal classifiers. Our theoretical predictions match remarkably well with numerical experiments on practically trained NNs, including NNs for ImageNet images. The matrix-theoretic explanation is consistent with an earlier information-theoretic…
Taxonomy
Topics: Adversarial Robustness in Machine Learning
