The effectiveness of feature attribution methods and its correlation with automatic evaluation scores
Giang Nguyen, Daeyoung Kim, Anh Nguyen

TL;DR
This study evaluates the real-world effectiveness of feature attribution methods in aiding human decision-making in image classification tasks, revealing poor correlation with automatic metrics and limited practical benefits.
Contribution
The paper presents the first user study assessing how effectively attribution maps aid humans. It highlights the disconnect between that effectiveness and automatic evaluation metrics, and the need for better testing in human-in-the-loop scenarios.
Findings
Attribution maps are not more effective than nearest training-set examples.
Presenting attribution maps can harm performance in fine-grained classification.
Automatic evaluation metrics poorly correlate with human-AI team performance.
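As a rough illustration of the last finding, the sketch below shows one way a correlation between an automatic metric and human-AI team accuracy could be computed. It uses Spearman rank correlation, which is an assumption made here for illustration and not necessarily the statistic used in the paper. The function names and all numbers are hypothetical placeholders, not results from the study.

```python
from statistics import mean


def rank(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))


# Hypothetical placeholder values, one entry per attribution method.
# These are NOT numbers from the paper.
automatic_scores = [0.62, 0.48, 0.71, 0.55]
human_team_accuracy = [0.80, 0.82, 0.81, 0.79]

rho = spearman(automatic_scores, human_team_accuracy)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho near zero would indicate that ranking methods by the automatic metric says little about how they rank by human-AI team accuracy.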
Abstract
Explaining the decisions of an Artificial Intelligence (AI) model is increasingly critical in many real-world, high-stakes applications. Hundreds of papers have proposed new feature attribution methods, or discussed or harnessed these tools in their work. However, although humans are the target end-users, most attribution methods have been evaluated only on proxy automatic-evaluation metrics (Zhang et al. 2018; Zhou et al. 2016; Petsiuk et al. 2018). In this paper, we conduct the first user study to measure the effectiveness of attribution maps in assisting humans in ImageNet classification and Stanford Dogs fine-grained classification, on both natural and adversarial images (i.e., images containing adversarial perturbations). Overall, feature attribution is surprisingly not more effective than showing humans nearest training-set examples. On a harder task of fine-grained dog categorization,…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)
