TL;DR
This paper introduces a causal feature selection method for explaining black-box visual classifiers by identifying input features with the greatest causal influence on model output, resulting in more interpretable and salient explanations.
Contribution
The paper proposes a causal extension of instance-wise feature selection that scores a subset of input features by its Relative Entropy Distance. Under certain assumptions, this measure equals the conditional mutual information between the selected subset and the model's output.
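For reference, conditional mutual information can be written as an expected KL divergence. The notation below is an assumption and does not come from the paper: X_S is the selected feature subset, X_{\bar S} the remaining features, and Y the model's output. The summary does not say which variables are conditioned on, so conditioning on the unselected features is only one plausible reading.

```latex
I(X_S; Y \mid X_{\bar S})
  = \mathbb{E}_{x \sim P(X)}\left[
      D_{\mathrm{KL}}\big(
        P(Y \mid X_S = x_S,\, X_{\bar S} = x_{\bar S})
        \,\|\,
        P(Y \mid X_{\bar S} = x_{\bar S})
      \big)
    \right]
```

Read this way, a subset scores highly when knowing it changes the output distribution substantially beyond what the remaining features already determine.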
Findings
Selected features are sparser and more salient.
The method improves post-hoc accuracy on multiple vision datasets.
Features have higher Average Causal Effect on model output.
Abstract
We formulate a causal extension to the recently introduced paradigm of instance-wise feature selection to explain black-box visual classifiers. Our method selects a subset of input features that has the greatest causal effect on the model's output. We quantify the causal influence of a subset of features by the Relative Entropy Distance measure. Under certain assumptions, this is equivalent to the conditional mutual information between the selected subset and the output variable. The resulting causal selections are sparser and cover salient objects in the scene. We show the efficacy of our approach on multiple vision datasets by measuring the post-hoc accuracy and Average Causal Effect of selected features on the model's output.
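The abstract evaluates selections by post-hoc accuracy. A minimal sketch of one common way to compute it follows, assuming the metric is the fraction of inputs whose predicted class is unchanged when only the selected features are kept. The names `post_hoc_accuracy`, `toy_classifier`, and the `baseline` fill value are hypothetical and not taken from the paper, whose exact masking scheme may differ.

```python
import numpy as np


def post_hoc_accuracy(predict_fn, inputs, masks, baseline=0.0):
    """Fraction of samples whose predicted class on the masked input
    matches the predicted class on the full input.

    predict_fn: maps an (n, d) array to an (n, num_classes) score array.
    masks:      (n, d) array, truthy where a feature is selected (kept).
    baseline:   assumed fill value for unselected features.
    """
    inputs = np.asarray(inputs, dtype=float)
    masks = np.asarray(masks, dtype=bool)
    # Keep selected features, replace everything else with the baseline.
    masked = np.where(masks, inputs, baseline)
    full_pred = np.argmax(predict_fn(inputs), axis=1)
    masked_pred = np.argmax(predict_fn(masked), axis=1)
    return float(np.mean(full_pred == masked_pred))


def toy_classifier(x):
    # Hypothetical stand-in for a black-box model: a two-class linear
    # scorer that ignores the middle feature entirely.
    weights = np.array([[1.0, -1.0], [0.0, 0.0], [-1.0, 1.0]])
    return x @ weights


if __name__ == "__main__":
    x = np.array([[2.0, 5.0, 1.0], [1.0, 5.0, 3.0]])
    keep_informative = np.array([[1, 0, 1], [1, 0, 1]])
    print(post_hoc_accuracy(toy_classifier, x, keep_informative))
```

In this toy setup, keeping the two features the classifier actually uses preserves every prediction, while keeping only the ignored feature would not. Real image explanations would apply the same idea with pixel or patch masks.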
Taxonomy
Methods: Feature Selection
