Searching for the Essence of Adversarial Perturbations
Dennis Y. Menn, Tzu-hsun Feng, and Hung-yi Lee

TL;DR
This paper reveals that adversarial perturbations contain human-recognizable features, which are key to understanding, explaining, and defending against adversarial attacks on neural networks.
Contribution
It demonstrates that adversarial perturbations contain human-recognizable information, challenging the widely held belief that human-unidentifiable characteristics are what fool a network, and enabling better interpretability and defense strategies.
Findings
Adversarial perturbations contain human-recognizable features.
Transferability of adversarial examples is explained by shared human-recognizable traits.
Masking and generation are identified as two properties of adversarial perturbations.
Abstract
Neural networks have demonstrated state-of-the-art performance in various machine learning fields. However, the introduction of malicious perturbations in input data, known as adversarial examples, has been shown to deceive neural network predictions. This poses potential risks for real-world applications such as autonomous driving and text identification. In order to mitigate these risks, a comprehensive understanding of the mechanisms underlying adversarial examples is essential. In this study, we demonstrate that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's incorrect prediction, in contrast to the widely held belief that human-unidentifiable characteristics play a critical role in fooling a network. This concept of human-recognizable characteristics enables us to explain key features of adversarial…
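To make the idea of a perturbation that deceives a model concrete, here is a minimal sketch using the fast gradient sign method (FGSM), a standard attack, on a toy logistic-regression classifier. It is an illustration only and is not taken from the paper: the weights `w`, the input `x`, the budget `eps`, and all function names are hypothetical, and the paper's own attacks and models may differ.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_perturbation(w, b, x, y, eps):
    """Return eps * sign(gradient of the cross-entropy loss w.r.t. x).

    For logistic regression the input gradient is (p - y) * w,
    so no autodiff library is needed in this sketch.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [eps * sign(g) for g in grad]


# Hypothetical model and input (assumptions for illustration only).
w = [2.0, -3.0, 1.0]
b = 0.0
x = [0.3, -0.2, 0.1]
y = 1  # true label

delta = fgsm_perturbation(w, b, x, y, eps=0.4)
x_adv = [xi + di for xi, di in zip(x, delta)]

print("clean prediction:      ", predict_proba(w, b, x))
print("adversarial prediction:", predict_proba(w, b, x_adv))
```

With these toy values, the clean input is classified as class 1 and the perturbed input as class 0, even though every coordinate moves by at most `eps`. The study's claim concerns what such perturbations look like for image classifiers; this sketch only shows how a perturbation is computed and why it changes the prediction.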
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)
