Jacobian Norm with Selective Input Gradient Regularization for Improved and Interpretable Adversarial Defense
Deyin Liu, Lin Wu, Haifeng Zhao, Farid Boussaid, Mohammed Bennamoun,, Xianghua Xie

TL;DR
This paper introduces J-SIGR, a novel regularization method that enhances neural network robustness against adversarial attacks while also improving interpretability of the model's predictions.
Contribution
The work proposes a new Jacobian-based regularization technique that simultaneously boosts adversarial robustness and interpretability of deep neural networks.
Findings
J-SIGR improves robustness against transferred adversarial attacks.
The method produces more interpretable neural network predictions.
Experiments across various architectures validate effectiveness.
Abstract
Deep neural networks (DNNs) are known to be vulnerable to adversarial examples that are crafted with imperceptible perturbations, i.e., a small change in an input image can induce a mis-classification, and thus threatens the reliability of deep learning based deployment systems. Adversarial training (AT) is often adopted to improve robustness through training a mixture of corrupted and clean data. However, most of AT based methods are ineffective in dealing with transferred adversarial examples which are generated to fool a wide spectrum of defense models, and thus cannot satisfy the generalization requirement raised in real-world scenarios. Moreover, adversarially training a defense model in general cannot produce interpretable predictions towards the inputs with perturbations, whilst a highly interpretable robust model is required by different domain experts to understand the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science
