Exploring Adversarial Examples and Adversarial Robustness of Convolutional Neural Networks by Mutual Information
Jiebao Zhang, Wenhua Qian, Rencan Nie, Jinde Cao, Dan Xu

TL;DR
This paper investigates how normally trained and adversarially trained CNNs differ in information extraction, showing that adversarial training leads CNNs to rely more on shape-based features, which affects their robustness to adversarial examples.
Contribution
It provides a mutual information perspective on the differences between normally trained and adversarially trained CNNs, highlighting their distinct feature preferences and the implications for adversarial robustness.
Findings
Adversarial training reduces the amount of information CNNs extract from inputs.
Normally trained CNNs (NT-CNNs) prefer texture-based features, while adversarially trained CNNs (AT-CNNs) focus on shape-based features.
Adversarial examples contain more texture information, misleading CNNs.
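The findings above are stated in terms of mutual information between inputs and what a network extracts from them. This summary does not say which estimator the paper uses, so the following is only a minimal sketch of one common choice, a histogram-based plug-in estimate for paired one-dimensional samples. The function name `mutual_information`, the `bins` parameter, and the synthetic data are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in estimate of I(X;Y) in bits from paired 1-D samples.

    Illustrative only: a histogram estimator, not the paper's procedure.
    """
    # Joint histogram of the paired samples, normalised to a probability table.
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    # Marginals obtained by summing out the other variable.
    p_x = p_xy.sum(axis=1, keepdims=True)   # shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # shape (1, bins)
    # Sum p(x,y) * log2(p(x,y) / (p(x) p(y))) over non-empty cells only.
    mask = p_xy > 0
    ratio = p_xy[mask] / (p_x @ p_y)[mask]
    return float(np.sum(p_xy[mask] * np.log2(ratio)))

# Synthetic check: a nearly deterministic relationship versus independence.
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
noise = rng.normal(size=5000)
mi_dependent = mutual_information(x, x + 0.1 * noise)
mi_independent = mutual_information(x, noise)
```

Under this sketch, `mi_dependent` should come out clearly larger than `mi_independent`, which stays close to zero. Histogram estimators are biased for small samples and scale poorly to high-dimensional activations, so they are only a starting point for analysing real network layers.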
Abstract
A counter-intuitive property of convolutional neural networks (CNNs) is their inherent susceptibility to adversarial examples, which severely hinders the application of CNNs in security-critical fields. Adversarial examples are similar to original examples but contain malicious perturbations. Adversarial training is a simple and effective defense method to improve the robustness of CNNs to adversarial examples. The mechanisms behind adversarial examples and adversarial training are worth exploring. Therefore, this work investigates similarities and differences between normally trained CNNs (NT-CNNs) and adversarially trained CNNs (AT-CNNs) in information extraction from the mutual information perspective. We show that 1) whether NT-CNNs or AT-CNNs, for original and adversarial examples, the trends towards mutual information are almost similar throughout training; 2) compared with normal…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security
