Jacobian Norm with Selective Input Gradient Regularization for Improved and Interpretable Adversarial Defense

Deyin Liu; Lin Wu; Haifeng Zhao; Farid Boussaid; Mohammed Bennamoun; Xianghua Xie

arXiv:2207.13036·cs.LG·November 15, 2022

TL;DR

This paper introduces J-SIGR, a novel regularization method that enhances neural network robustness against adversarial attacks while also improving interpretability of the model's predictions.

Contribution

The work proposes a new Jacobian-based regularization technique that simultaneously boosts adversarial robustness and interpretability of deep neural networks.
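As a rough illustration of what a Jacobian-norm penalty can look like in general, the sketch below adds the squared Frobenius norm of the input-output Jacobian to a cross-entropy loss. It is not the authors' J-SIGR implementation: the toy network, the finite-difference Jacobian, the weight `lam`, and all function names are assumptions made for this example, and the selective input gradient term is not shown because this page does not describe it.

```python
import math

def forward(params, x):
    """Toy one-hidden-layer network (assumed for illustration): W2 @ tanh(W1 @ x)."""
    W1, W2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sum(w * hi for w, hi in zip(row, h)) for row in W2]

def jacobian_fd(params, x, eps=1e-5):
    """Central finite-difference Jacobian of the logits w.r.t. the input.

    Returns a nested list of shape [n_outputs][n_inputs].
    """
    n_out = len(forward(params, x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus = forward(params, x_plus)
        f_minus = forward(params, x_minus)
        for i in range(n_out):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]

def regularized_loss(params, x, label, lam=0.1):
    """Cross-entropy plus lam * ||J||_F^2, where lam is a hypothetical weight."""
    J = jacobian_fd(params, x)
    frob_sq = sum(v * v for row in J for v in row)
    return cross_entropy(forward(params, x), label) + lam * frob_sq
```

A small Jacobian norm means that small input perturbations change the logits only slightly, which is the intuition behind using it as a robustness regularizer. In practice the Jacobian would be computed with automatic differentiation rather than finite differences; the pure-Python version here only keeps the example dependency-free.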

Findings

01. J-SIGR improves robustness against transferred adversarial attacks.

02. The method produces more interpretable neural network predictions.

03. Experiments across various architectures validate effectiveness.

Abstract

Deep neural networks (DNNs) are known to be vulnerable to adversarial examples crafted with imperceptible perturbations: a small change in an input image can induce a misclassification and thus threaten the reliability of deployed deep-learning-based systems. Adversarial training (AT) is often adopted to improve robustness by training on a mixture of corrupted and clean data. However, most AT-based methods are ineffective against transferred adversarial examples, which are generated to fool a wide spectrum of defense models, and thus cannot satisfy the generalization requirement raised in real-world scenarios. Moreover, adversarially training a defense model generally cannot produce interpretable predictions for perturbed inputs, whilst a highly interpretable robust model is required by different domain experts to understand the…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Machine Learning in Materials Science