Rigorous Feature Importance Scores based on Shapley Value and Banzhaf Index
Xuanxiang Huang, Olivier Létoffé, Joao Marques-Silva

TL;DR
This paper introduces two game-theoretic feature importance scores that also take into account feature sets that are not weak abductive explanations (non-WAXp sets), so that each score reflects how effective a feature is at excluding adversarial examples.
Contribution
It proposes two novel feature importance scores using Shapley value and Banzhaf index that incorporate non-WAXp sets, enhancing feature attribution in XAI.
Findings
Scores effectively quantify feature contribution in excluding adversarial examples
Properties and computational complexity of the scores are analyzed
Scores outperform existing methods in certain explanation tasks
Abstract
Feature attribution methods based on game theory are ubiquitous in the field of eXplainable Artificial Intelligence (XAI). Recent works proposed rigorous feature attribution using logic-based explanations, specifically targeting high-stakes uses of machine learning (ML) models. Typically, such works exploit weak abductive explanation (WAXp) as the characteristic function to assign importance to features. However, one possible downside is that the contribution of non-WAXp sets is neglected. In fact, non-WAXp sets can also convey important information, because of the relationship between formal explanations (XPs) and adversarial examples (AExs). Accordingly, this paper leverages Shapley value and Banzhaf index to devise two novel feature importance scores. We take into account non-WAXp sets when computing feature contribution, and the novel scores quantify how effective each feature is at…
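To make the baseline setting in the abstract concrete, here is a minimal sketch of the standard Shapley value and Banzhaf index computed over a WAXp-based characteristic function. The toy Boolean classifier, the instance, and all function names are hypothetical and chosen only for illustration. The sketch covers only the earlier approach that the abstract describes, in which a feature set is worth 1 if it is a WAXp and 0 otherwise. It does not implement the paper's two novel scores, whose definitions are not given in this excerpt.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

# Hypothetical toy classifier over three Boolean features (illustration only).
def classifier(x):
    return int(x[0] and (x[1] or x[2]))

INSTANCE = (1, 1, 1)   # hypothetical instance being explained
N = len(INSTANCE)

def is_waxp(fixed):
    """True if fixing the features in `fixed` to the instance's values
    forces the same prediction for every assignment to the other features."""
    target = classifier(INSTANCE)
    free = [i for i in range(N) if i not in fixed]
    for values in product((0, 1), repeat=len(free)):
        point = list(INSTANCE)
        for i, val in zip(free, values):
            point[i] = val
        if classifier(point) != target:
            return False  # a point with a different prediction exists
    return True

def char_fn(subset):
    """WAXp-based characteristic function: 1 for a WAXp, 0 for a non-WAXp set."""
    return 1 if is_waxp(frozenset(subset)) else 0

def subsets_without(i):
    """All subsets of the feature set that exclude feature i."""
    others = [j for j in range(N) if j != i]
    for k in range(len(others) + 1):
        for s in combinations(others, k):
            yield frozenset(s)

def shapley(i):
    """Standard Shapley value: size-weighted average marginal contribution."""
    total = Fraction(0)
    for s in subsets_without(i):
        weight = Fraction(factorial(len(s)) * factorial(N - len(s) - 1),
                          factorial(N))
        total += weight * (char_fn(s | {i}) - char_fn(s))
    return total

def banzhaf(i):
    """Standard Banzhaf index: uniform average marginal contribution."""
    total = sum(char_fn(s | {i}) - char_fn(s) for s in subsets_without(i))
    return Fraction(total, 2 ** (N - 1))

shapley_scores = [shapley(i) for i in range(N)]
banzhaf_scores = [banzhaf(i) for i in range(N)]
print("Shapley:", shapley_scores)
print("Banzhaf:", banzhaf_scores)
```

In this toy setting every non-WAXp set contributes 0 to the characteristic function, so any information such sets carry about the existence of differently classified points is discarded. That is the limitation the abstract points to. The enumeration is exponential in the number of features and is meant only for very small examples.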
Taxonomy
Topics: Technology and Data Analysis
