Feature Attribution with Necessity and Sufficiency via Dual-stage Perturbation Test for Causal Explanation
Xuexin Chen, Ruichu Cai, Zhengting Huang, Yuxuan Zhu, Julien Horwood, Zhifeng Hao, Zijian Li, Jose Miguel Hernandez-Lobato

TL;DR
This paper introduces FANS, a novel feature attribution method that improves the discrimination of feature importance by estimating necessity and sufficiency through a dual-stage perturbation test, enhancing explainability in machine learning models.
Contribution
FANS is a new approach that combines neighborhood estimation and dual-stage perturbation testing to better identify feature importance in model explanations.
Findings
FANS outperforms existing attribution methods on six benchmarks.
The method effectively distinguishes feature contributions even when perturbation effects are similar.
FANS provides more accurate and interpretable feature importance scores.
Abstract
We investigate the problem of explainability for machine learning models, focusing on Feature Attribution Methods (FAMs) that evaluate feature importance through perturbation tests. Despite their utility, FAMs struggle to distinguish the contributions of different features when their prediction changes are similar after perturbation. To enhance FAMs' discriminative power, we introduce Feature Attribution with Necessity and Sufficiency (FANS), which finds a neighborhood of the input such that perturbing samples within this neighborhood has a high Probability of being a Necessary and Sufficient (PNS) cause for the change in predictions, and uses this PNS as the importance of the feature. Specifically, FANS computes this PNS via a heuristic strategy for estimating the neighborhood and a perturbation test involving two stages (factual and interventional) for counterfactual reasoning. To…
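To make the necessity and sufficiency idea concrete, here is a minimal Python sketch of two perturbation tests on a toy classifier. It is an illustration under stated assumptions, not the FANS estimator: the function name `perturbation_scores`, the Gaussian perturbation, the fixed noise scale, the toy model `toy_predict`, and the product used to combine the two rates are all hypothetical choices made for this example, and the sketch omits the neighborhood estimation and the factual/interventional staging described in the abstract.

```python
import numpy as np

def perturbation_scores(predict, x, subset, n_samples=2000, noise=1.0, seed=0):
    """Crude necessity/sufficiency-style rates for a feature subset.

    Assumption: Gaussian noise stands in for the perturbation; this is
    NOT the PNS estimator from the paper, only an illustration.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    mask = np.zeros(x.shape[0], dtype=bool)
    mask[list(subset)] = True

    y0 = predict(x[None, :])[0]  # prediction on the unperturbed input
    draws = rng.normal(0.0, noise, size=(n_samples, x.shape[0]))

    # Necessity-style test: perturb only the subset and count how often
    # the prediction changes.
    x_nec = np.tile(x, (n_samples, 1))
    x_nec[:, mask] += draws[:, mask]
    necessity = float(np.mean(predict(x_nec) != y0))

    # Sufficiency-style test: keep the subset fixed, perturb everything
    # else, and count how often the prediction is preserved.
    x_suf = np.tile(x, (n_samples, 1))
    x_suf[:, ~mask] += draws[:, ~mask]
    sufficiency = float(np.mean(predict(x_suf) == y0))

    # Hypothetical combination: a plain product, chosen for simplicity.
    return necessity, sufficiency, necessity * sufficiency

def toy_predict(X):
    """Toy classifier whose output depends only on feature 0."""
    return (X[:, 0] > 0).astype(int)

x = [0.5, 0.5, 0.5]
nec0, suf0, score0 = perturbation_scores(toy_predict, x, subset=[0])
nec1, suf1, score1 = perturbation_scores(toy_predict, x, subset=[1])
```

In this toy setup, feature 1 never changes the prediction when perturbed, so its necessity-style rate is zero, while feature 0 gets a nonzero combined score. A single perturbation test that only measures "prediction preserved when the rest is perturbed" would not separate the two features as cleanly, which is the kind of ambiguity the abstract describes.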
Taxonomy
Topics: Fault Detection and Control Systems · Bayesian Modeling and Causal Inference · Adversarial Robustness in Machine Learning
