"Is your explanation stable?": A Robustness Evaluation Framework for Feature Attribution

Yuyou Gan; Yuhao Mao; Xuhong Zhang; Shouling Ji; Yuwen Pu; Meng Han; Jianwei Yin; Ting Wang

arXiv:2209.01782 · cs.AI · September 7, 2022 · 1 citation

TL;DR

This paper introduces MeTFA, a model-agnostic framework that enhances the stability and robustness of feature attribution explanations for neural networks by quantifying uncertainty and reducing instability.

Contribution

It proposes a novel median test-based method that tests whether a feature's importance is statistically significant and constructs a confidence interval for it, improving explanation stability and robustness against noise and adversarial attacks.
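As a rough illustration of what a confidence interval built around a median can look like, the sketch below computes the standard distribution-free interval from order statistics. It is not taken from the MeTFA repository: the function name `median_confidence_interval`, the default `alpha`, and the idea of feeding it one feature's attribution scores collected over noisy copies of an input are all assumptions made for this example.

```python
import math
import numpy as np


def median_confidence_interval(samples, alpha=0.05):
    """Distribution-free confidence interval for the median (hypothetical helper).

    Returns the order statistics (x_(k), x_(n-k+1)) for the largest k such that
    2 * P[Binomial(n, 1/2) <= k - 1] <= alpha, which gives coverage >= 1 - alpha.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = 0        # 1-indexed rank of the lower endpoint; 0 means "none found"
    cdf = 0.0    # running value of P[Binomial(n, 1/2) <= i]
    for i in range(n):
        cdf += math.comb(n, i) * 0.5 ** n
        if 2 * cdf > alpha:
            break
        k = i + 1
    if k == 0:
        raise ValueError("too few samples for the requested confidence level")
    return x[k - 1], x[n - k]


# Usage: attribution scores of a single feature over 10 noisy inputs (made-up values).
scores = [0.42, 0.51, 0.47, 0.55, 0.49, 0.44, 0.58, 0.50, 0.46, 0.53]
low, high = median_confidence_interval(scores, alpha=0.05)
print(low, high)
```

The interval needs no assumption about the shape of the score distribution, which is the usual reason to work with the median rather than the mean when the sampled explanations may contain outliers.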

Findings

01. MeTFA significantly reduces explanation instability.

02. It improves the visual quality and faithfulness of explanations.

03. MeTFA enhances robustness against explanation attacks.

Abstract

Understanding the decision process of neural networks is hard. One vital method for explanation is to attribute a network's decision to pivotal features. Although many algorithms have been proposed, most of them only improve faithfulness to the model. However, real environments contain a great deal of random noise, which may lead to large fluctuations in the explanations. More seriously, recent work shows that explanation algorithms are vulnerable to adversarial attacks. All of this makes explanations hard to trust in real scenarios. To bridge this gap, we propose a model-agnostic method, Median Test for Feature Attribution (MeTFA), to quantify the uncertainty and increase the stability of explanation algorithms with theoretical guarantees. MeTFA has the following two functions: (1) examine whether one feature is significantly important or unimportant and generate a MeTFA-significant map…
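The abstract is truncated here, so the details of the MeTFA test are not available on this page. As a minimal sketch of the general idea of testing whether a feature is "significantly important or unimportant", the code below applies a textbook one-sided sign test to the median of each feature's attribution scores, collected from noisy copies of an input. The function names, the threshold, the Gaussian noise model, and the toy linear "explainer" are all assumptions for illustration, not the authors' implementation.

```python
import math
import numpy as np


def binom_upper_tail(k, n, p=0.5):
    """P[X >= k] for X ~ Binomial(n, p), computed exactly."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def median_sign_test(scores, threshold, alpha=0.05):
    """Label each feature by a one-sided sign test on its median score.

    scores: array of shape (n_samples, n_features), one row per noisy input.
    Returns +1 (median significantly above threshold), -1 (significantly
    below), or 0 (undecided) for each feature.
    """
    scores = np.asarray(scores, dtype=float)
    above = (scores > threshold).sum(axis=0)
    below = (scores < threshold).sum(axis=0)
    labels = np.zeros(scores.shape[1], dtype=int)
    for j in range(scores.shape[1]):
        m = int(above[j] + below[j])  # samples equal to the threshold are dropped
        if m == 0:
            continue
        if binom_upper_tail(int(above[j]), m) < alpha:
            labels[j] = 1
        elif binom_upper_tail(int(below[j]), m) < alpha:
            labels[j] = -1
    return labels


# Toy setup (assumed): a linear model whose attribution is weight * input.
rng = np.random.default_rng(0)
weights = np.array([2.0, 0.5, -1.0])
x = np.ones(3)
noisy_inputs = x + rng.normal(scale=0.1, size=(50, 3))
attributions = weights * noisy_inputs  # shape (50, 3)
print(median_sign_test(attributions, threshold=0.5))
```

Features labelled 0 are those whose scores straddle the threshold too evenly for the test to decide either way at level `alpha`, which is one simple way to expose the uncertainty of an explanation instead of hiding it.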

Code & Models

Repositories

sweet-shark/MeTFA-A-Robustness-Evaluation-Framework-for-Feature-Attribution
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications

Methods: Test