Attribution for Enhanced Explanation with Transferable Adversarial eXploration

Zhiyu Zhu; Jiayu Zhang; Zhibo Jin; Huaming Chen; Jianlong Zhou; Fang Chen

arXiv:2412.19523·cs.AI·December 30, 2024

TL;DR

AttEXplore++ significantly improves deep neural network interpretability by integrating transferable adversarial attack methods, leading to more accurate, robust, and stable model explanations across various architectures and datasets.

Contribution

This paper introduces AttEXplore++, an enhanced attribution framework built upon AttEXplore that incorporates transferable adversarial attack methods such as MIG and GRA to improve the accuracy and robustness of model explanations.

Findings

01 Achieves an average improvement of 7.57% over AttEXplore

02 Improves interpretability scores by 32.62% over other state-of-the-art interpretability algorithms

03 Provides more stable explanations across different models

Abstract

The interpretability of deep neural networks is crucial for understanding model decisions in various applications, including computer vision. AttEXplore++, an advanced framework built upon AttEXplore, enhances attribution by incorporating transferable adversarial attack methods such as MIG and GRA, significantly improving the accuracy and robustness of model explanations. We conduct extensive experiments on five models, including CNNs (Inception-v3, ResNet-50, VGG16) and vision transformers (MaxViT-T, ViT-B/16), using the ImageNet dataset. Our method achieves an average performance improvement of 7.57% over AttEXplore and 32.62% compared to other state-of-the-art interpretability algorithms. Using insertion and deletion scores as evaluation metrics, we show that adversarial transferability plays a vital role in enhancing attribution results. Furthermore, we explore the impact of…
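The abstract names insertion and deletion scores as the evaluation metrics without defining them. The sketch below shows one common way such scores are computed: pixels are revealed (insertion) or erased (deletion) in order of decreasing attribution, and the area under the resulting model-score curve is reported. It is an illustration under stated assumptions, not the paper's evaluation code: the function names `perturbation_scores` and `area_under_curve`, the all-zero baseline, the step count, and the linear `toy_model` are all hypothetical, and the attribution map is assumed to have the same shape as the image.

```python
import numpy as np

def perturbation_scores(model, image, attribution, baseline, steps=10, mode="insertion"):
    """Score the model while pixels are inserted or deleted, most-attributed first."""
    image = np.asarray(image, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    # Pixel indices sorted from highest to lowest attribution.
    order = np.argsort(-np.asarray(attribution, dtype=float).ravel(), kind="stable")
    if mode == "insertion":
        start, target = baseline, image   # reveal the image on top of the baseline
    elif mode == "deletion":
        start, target = image, baseline   # erase the image towards the baseline
    else:
        raise ValueError("mode must be 'insertion' or 'deletion'")
    current = start.ravel().copy()
    target_flat = target.ravel()
    n = order.size
    scores = [model(current.reshape(image.shape))]
    for k in range(1, steps + 1):
        # Perturb the next slice of pixels in attribution order.
        idx = order[(k - 1) * n // steps : k * n // steps]
        current[idx] = target_flat[idx]
        scores.append(model(current.reshape(image.shape)))
    return np.array(scores, dtype=float)

def area_under_curve(scores):
    """Trapezoidal area with the perturbed fraction normalized to [0, 1]."""
    return float(np.sum((scores[:-1] + scores[1:]) / 2.0) / (len(scores) - 1))

# Hypothetical toy setup: a linear "model" whose true pixel importance is `weights`.
weights = np.arange(16, dtype=float).reshape(4, 4)

def toy_model(x):
    return float((weights * x).sum() / weights.sum())

image = np.ones((4, 4))
baseline = np.zeros((4, 4))  # assumed "uninformative" input

insertion = area_under_curve(
    perturbation_scores(toy_model, image, weights, baseline, steps=4, mode="insertion"))
deletion = area_under_curve(
    perturbation_scores(toy_model, image, weights, baseline, steps=4, mode="deletion"))
```

Under this definition a faithful attribution map yields a high insertion score and a low deletion score, since the pixels it ranks first are the ones that move the model output the most. With a real classifier, `model` would return the predicted probability of the target class, and the baseline is a design choice (black or blurred images are common).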

Taxonomy

Topics: Scientific Computing and Data Management · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning