Good-looking but Lacking Faithfulness: Understanding Local Explanation Methods through Trend-based Testing

Jinwen He; Kai Chen; Guozhu Meng; Jiangshan Zhang; Congyi Li

arXiv:2309.05679·cs.LG·September 13, 2023

TL;DR

This paper introduces trend-based faithfulness tests for local explanation methods and shows that they assess explanations more reliably than traditional tests on complex data across image, natural language, and security tasks, which in turn improves model debugging.

Contribution

It proposes novel trend-based faithfulness tests that improve the evaluation of explanation methods, especially for complex data, and demonstrates their effectiveness through extensive empirical evaluation.

Findings

01 Traditional faithfulness tests are dominated by randomness, especially on complex data.

02 Trend-based tests better assess explanation faithfulness across image, language, and security tasks.

03 Model debugging benefits significantly from the improved faithfulness evaluation.

Abstract

While enjoying the great achievements brought by deep learning (DL), people are also worried about the decisions made by DL models, since the high degree of non-linearity of DL models makes those decisions extremely difficult to understand. Consequently, attacks such as adversarial attacks are easy to carry out but difficult to detect and explain, which has led to a boom in research on local explanation methods for explaining model decisions. In this paper, we evaluate the faithfulness of explanation methods and find that traditional tests on faithfulness encounter the random dominance problem, i.e., random selection performs the best, especially for complex data. To solve this problem, we propose three trend-based faithfulness tests and empirically demonstrate that the new trend tests can better assess faithfulness than traditional tests on image, natural language and…
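
The abstract does not say which traditional faithfulness tests are meant. As an illustration only, the sketch below assumes a deletion-style test: features are removed in the order an explanation ranks them, and the resulting drop in model output is compared against a random removal order. The toy linear model, the weights, and the helper names `deletion_curve` and `area_under` are hypothetical and do not come from the paper or its repository.

```python
import random

def deletion_curve(model, x, ranking, baseline=0.0):
    """Model outputs recorded while features are replaced by `baseline`,
    one at a time, in the order given by `ranking`."""
    x = list(x)
    outputs = [model(x)]
    for i in ranking:
        x[i] = baseline
        outputs.append(model(x))
    return outputs

def area_under(curve):
    """Mean height of the curve; lower means the output dropped faster."""
    return sum(curve) / len(curve)

# Hypothetical toy model: a fixed linear scorer (an assumption for illustration).
weights = [4.0, 0.5, 3.0, 0.1, 2.0]

def model(x):
    return sum(w * v for w, v in zip(weights, x))

x = [1.0] * len(weights)

# For a linear model, weight * input is an exact attribution.
attribution = [w * v for w, v in zip(weights, x)]
explained_order = sorted(range(len(x)), key=lambda i: -attribution[i])

# Random-selection baseline with a fixed seed for repeatability.
rng = random.Random(0)
random_order = list(range(len(x)))
rng.shuffle(random_order)

auc_explained = area_under(deletion_curve(model, x, explained_order))
auc_random = area_under(deletion_curve(model, x, random_order))

print("explained order:", explained_order)
print("explanation no worse than random:", auc_explained <= auc_random)
```

On this toy linear model the exact attribution can never lose to a random order. The random dominance problem described in the abstract is the situation where, on complex data and real DL models, a comparison of this kind instead favors random selection.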

Peer Reviews

No public reviews on file for this paper yet.

Code & Models

Repositories

jenniferho97/xai-trend-test
PyTorch · Official