What Do You See? Evaluation of Explainable Artificial Intelligence (XAI) Interpretability through Neural Backdoors

Yi-Shan Lin; Wen-Chuan Lee; Z. Berkay Celik

arXiv:2009.10639 · cs.CV · September 23, 2020 · 27 citations

TL;DR

This paper proposes using backdoor trigger patterns as ground truth to automate and improve the evaluation of XAI methods' interpretability, revealing limitations of current approaches in identifying important input regions.

Contribution

It introduces a novel backdoor-based evaluation framework and metrics for assessing XAI explanations, demonstrating its effectiveness across multiple models and explanation methods.

Findings

01 Model-free methods outperform local explanation methods in identifying trigger regions.

02 Six explanation methods fail to fully highlight backdoor triggers.

03 Backdoor triggers serve as ground truth for evaluating explanation relevance.

Abstract

EXplainable AI (XAI) methods have been proposed to interpret how a deep neural network predicts inputs through model saliency explanations that highlight the parts of the inputs deemed important to arrive at a decision at a specific target. However, it remains challenging to quantify the correctness of their interpretability, as current evaluation approaches either require subjective input from humans or incur high computation cost with automated evaluation. In this paper, we propose backdoor trigger patterns--hidden malicious functionalities that cause misclassification--to automate the evaluation of saliency explanations. Our key observation is that triggers provide ground truth for inputs to evaluate whether the regions identified by an XAI method are truly relevant to its output. Since backdoor triggers are the most important features that cause deliberate misclassification, a robust XAI…
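To make the idea of a trigger as ground truth concrete, here is a minimal sketch of how a saliency map could be scored against a known trigger region. It is an illustration under stated assumptions, not the paper's implementation: the function name `trigger_iou`, the `top_fraction` thresholding rule, and the choice of intersection-over-union as the score are all hypothetical choices made for this example, and the toy 8x8 arrays stand in for a real image and a real explanation.

```python
import numpy as np


def trigger_iou(saliency, trigger_mask, top_fraction=0.05):
    """Intersection-over-union between the top-scoring saliency pixels
    and the known trigger region (hypothetical metric for illustration)."""
    saliency = np.asarray(saliency, dtype=float)
    trigger_mask = np.asarray(trigger_mask, dtype=bool)
    if saliency.shape != trigger_mask.shape:
        raise ValueError("saliency and trigger_mask must have the same shape")

    # Treat the k highest-scoring pixels as the region the explanation highlights.
    k = max(1, int(round(top_fraction * saliency.size)))
    top_idx = np.argpartition(saliency.ravel(), -k)[-k:]
    predicted = np.zeros(saliency.size, dtype=bool)
    predicted[top_idx] = True
    predicted = predicted.reshape(saliency.shape)

    intersection = np.logical_and(predicted, trigger_mask).sum()
    union = np.logical_or(predicted, trigger_mask).sum()
    return float(intersection / union) if union else 0.0


# Toy setup (assumed): an 8x8 input with a 2x2 trigger in the bottom-right corner.
trigger = np.zeros((8, 8), dtype=bool)
trigger[6:, 6:] = True

# An explanation that concentrates all saliency on the trigger.
on_trigger = trigger.astype(float)

# An explanation that highlights the opposite corner instead.
off_trigger = np.zeros((8, 8))
off_trigger[:2, :2] = 1.0

iou_on = trigger_iou(on_trigger, trigger, top_fraction=4 / 64)    # 1.0
iou_off = trigger_iou(off_trigger, trigger, top_fraction=4 / 64)  # 0.0
```

Because the trigger location is known by construction, a score like this needs no human judgment: an explanation that covers the trigger scores high, and one that highlights unrelated pixels scores low.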

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications

Methods: Interpretability