SCAAT: Improving Neural Network Interpretability via Saliency Constrained Adaptive Adversarial Training

Rui Xu; Wenkang Qin; Peixiang Huang; Hao Wang; Lin Luo

arXiv:2311.05143 · cs.CV · November 13, 2023 · 1 citation

TL;DR

SCAAT is a novel training method that enhances neural network interpretability by generating clearer, more faithful saliency maps through adversarial training guided by saliency, without changing model architecture.

Contribution

The paper introduces SCAAT, a model-agnostic adversarial training approach that improves saliency map quality and interpretability of DNNs without architectural modifications.

Findings

01. Saliency maps become sparser and less noisy.

02. Interpretability improves across multiple datasets.

03. Predictive accuracy remains unaffected.

Abstract

Deep Neural Networks (DNNs) are expected to provide explanations that help users understand their black-box predictions. A saliency map is a common form of explanation that presents feature attributions as a heatmap, but it suffers from noise that makes important features hard to distinguish. In this paper, we propose a model-agnostic learning method called Saliency Constrained Adaptive Adversarial Training (SCAAT) to improve the quality of such DNN interpretability. By constructing adversarial samples under the guidance of the saliency map, SCAAT effectively eliminates most noise and makes saliency maps sparser and more faithful without any modification to the model architecture. We apply SCAAT to multiple DNNs and evaluate the quality of the generated saliency maps on various natural and pathological image datasets. Evaluations on different domains and metrics show that SCAAT significantly improves the…
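The abstract does not spell out how the saliency map guides the adversarial samples. The sketch below shows one possible reading, and it is an assumption rather than the authors' algorithm: compute a vanilla-gradient saliency map for a toy logistic model, then apply an FGSM-style step only to the least salient features. The model, the function names, and the `eps` and `keep_fraction` parameters are all hypothetical and chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(w, b, x, y):
    """Gradient of binary cross-entropy w.r.t. the input x of a logistic model."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]


def saliency_map(w, b, x, y):
    """Vanilla-gradient saliency: absolute value of the input gradient."""
    return [abs(g) for g in loss_gradient(w, b, x, y)]


def sign(g):
    return (g > 0) - (g < 0)


def saliency_constrained_perturbation(w, b, x, y, eps=0.1, keep_fraction=0.5):
    """ASSUMPTION (not taken from the paper): perturb only the least salient
    features with an FGSM-style signed step, leaving salient features intact."""
    grad = loss_gradient(w, b, x, y)
    sal = [abs(g) for g in grad]
    n_perturb = int(len(x) * (1.0 - keep_fraction))
    # Indices of the n_perturb features with the lowest saliency.
    low = set(sorted(range(len(x)), key=lambda i: sal[i])[:n_perturb])
    return [
        xi + eps * sign(gi) if i in low else xi
        for i, (xi, gi) in enumerate(zip(x, grad))
    ]


# Toy usage: features 0 and 2 carry large weights, features 1 and 3 small ones.
w = [2.0, 0.1, -1.5, 0.05]
x = [1.0, 1.0, 1.0, 1.0]
x_adv = saliency_constrained_perturbation(w, 0.0, x, y=1)
print(saliency_map(w, 0.0, x, 1))
print(x_adv)
```

In an adversarial training loop, samples such as `x_adv` would be fed back to the model alongside the clean inputs. Under this reading, the model would be trained so that perturbing low-saliency features does not change its output, which is one way the saliency map could become sparser. How SCAAT actually selects and scales its perturbations should be taken from the full paper.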

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification

Methods: Heatmap