Feature salience - not task-informativeness - drives machine learning model explanations
Benedict Clark, Marta Oliveira, Rick Wilming, Stefan Haufe

TL;DR
This study shows that feature importance in machine learning explanations is primarily driven by feature salience, such as visual prominence, rather than by how informative a feature is about the target variable, challenging a core assumption in explainable AI.
Contribution
The paper demonstrates that popular attribution methods often highlight salient features regardless of their relevance to the target, and it calls for a reevaluation of XAI practices.
Findings
Feature importance correlates strongly with visual salience rather than informativeness.
Watermarks influence importance attribution regardless of their class dependence.
Attribution methods behave similarly to simple edge detection filters.
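The findings above suggest two simple checks one could run on any attribution map. The following is a minimal sketch, not the authors' evaluation code: it assumes a 2D attribution map and a boolean watermark mask are already available, and the function names (sobel_magnitude, pearson, relative_importance) are hypothetical. It computes a plain Sobel edge map as a model-free baseline, its Pearson correlation with an attribution map, and the share of absolute attribution that falls inside a masked region.

```python
import numpy as np


def sobel_magnitude(image):
    """Gradient magnitude of a 2D image using 3x3 Sobel kernels (edge-padded)."""
    img = np.pad(np.asarray(image, dtype=float), 1, mode="edge")
    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]) - (
        img[:-2, :-2] + 2 * img[1:-1, :-2] + img[2:, :-2]
    )
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]) - (
        img[:-2, :-2] + 2 * img[:-2, 1:-1] + img[:-2, 2:]
    )
    return np.hypot(gx, gy)


def pearson(a, b):
    """Pearson correlation between two arrays of equal size (0.0 if either is constant)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def relative_importance(attribution, mask):
    """Fraction of total absolute attribution that lies inside the boolean mask."""
    abs_attr = np.abs(np.asarray(attribution, dtype=float))
    total = abs_attr.sum()
    return float(abs_attr[mask].sum() / total) if total > 0 else 0.0
```

A high pearson(sobel_magnitude(image), np.abs(attribution)) would indicate that the attribution map carries little beyond what an edge filter already provides, and a relative_importance value well above the mask's share of the image area would indicate that the masked region attracts disproportionate attribution.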
Abstract
Explainable AI (XAI) promises to provide insight into machine learning models' decision processes, where one goal is to identify failures such as shortcut learning. This promise relies on the field's assumption that input features marked as important by an XAI method must contain information about the target variable. However, it is unclear whether informativeness is indeed the main driver of importance attribution in practice, or if other data properties such as statistical suppression, novelty at test-time, or high feature salience substantially contribute. To clarify this, we trained deep learning models on three variants of a binary image classification task, in which translucent watermarks are either absent, act as class-dependent confounds, or represent class-independent noise. Results for five popular attribution methods show substantially elevated relative importance in watermarked…
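The three task variants described in the abstract can be illustrated with a small data-generation sketch. This is an assumption-laden toy, not the authors' pipeline: the function names, the constant-valued watermark, the alpha of 0.5, the choice to watermark only class 1 in the confounded variant, and the 50% watermark rate in the noise variant are all hypothetical choices made for illustration.

```python
import numpy as np


def blend_watermark(image, mask, alpha=0.5, value=1.0):
    """Alpha-blend a constant-valued translucent watermark into `image` where `mask` is True."""
    out = np.asarray(image, dtype=float).copy()
    out[mask] = (1.0 - alpha) * out[mask] + alpha * value
    return out


def make_variant(images, labels, mask, variant, rng, alpha=0.5):
    """Build one hypothetical dataset variant.

    "none":     no image is watermarked.
    "confound": every class-1 image is watermarked (class-dependent).
    "noise":    a random half of all images is watermarked (class-independent).
    Returns the images and a boolean array marking which ones were watermarked.
    """
    n = len(images)
    if variant == "none":
        applied = np.zeros(n, dtype=bool)
    elif variant == "confound":
        applied = np.asarray(labels) == 1
    elif variant == "noise":
        applied = rng.random(n) < 0.5
    else:
        raise ValueError(f"unknown variant: {variant!r}")
    out = np.stack(
        [
            blend_watermark(img, mask, alpha) if flag else np.asarray(img, dtype=float)
            for img, flag in zip(images, applied)
        ]
    )
    return out, applied
```

In the "noise" variant the watermark carries no information about the label by construction, so any importance an attribution method assigns to the watermark region there cannot be explained by informativeness.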
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
