Do Not Trust Additive Explanations
Alicja Gosiewska, Przemyslaw Biecek

TL;DR
This paper critically examines the faithfulness of additive explanations like LIME and SHAP in complex models, introduces a new interaction detection method, and benchmarks their reliability in the presence of feature interactions.
Contribution
It introduces a novel method to detect interactions in instance-level explanations and evaluates the reliability of additive explanations in non-additive models.
Findings
Additive explanations can be misleading in models with feature interactions
The new interaction detection method effectively identifies when explanations are unreliable
Benchmark results show frequent discrepancies between explanations and true model behavior
Abstract
Explainable Artificial Intelligence (XAI) has received a great deal of attention recently. Explainability is being presented as a remedy for the distrust of complex and opaque models. Model-agnostic methods such as LIME, SHAP, or Break Down promise instance-level interpretability for any complex machine learning model. But how faithful are these additive explanations? Can we rely on additive explanations for non-additive models? In this paper, we (1) examine the behavior of the most popular instance-level explanations in the presence of interactions, (2) introduce a new method that detects interactions for instance-level explanations, and (3) perform a large-scale benchmark to see how frequently additive explanations may be misleading.
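To make the concern about non-additive models concrete, the following is a minimal toy sketch, not the authors' method or code. It computes Break Down-style sequential contributions by fixing features one at a time and tracking the change in the mean prediction over a small background dataset. The function names (`sequential_attributions`, `order_spread`), the two toy models, and the four-point background set are all assumptions made for illustration.

```python
import itertools
import numpy as np

def sequential_attributions(model, background, instance, order):
    """Fix features one at a time in `order` and record how the mean
    prediction over the background data changes at each step."""
    data = background.copy()
    previous = model(data).mean()
    contributions = {}
    for j in order:
        data[:, j] = instance[j]          # set feature j to the explained value
        current = model(data).mean()
        contributions[j] = current - previous
        previous = current
    return contributions

# Hypothetical background data: all combinations of two binary features.
background = np.array(list(itertools.product([0.0, 1.0], repeat=2)))
instance = np.array([1.0, 1.0])

def additive_model(X):
    return X[:, 0] + X[:, 1]              # no interaction

def interaction_model(X):
    return X[:, 0] * X[:, 1]              # pure interaction

def order_spread(model):
    """Largest change in any feature's contribution across feature orderings."""
    results = [
        sequential_attributions(model, background, instance, order)
        for order in itertools.permutations(range(background.shape[1]))
    ]
    return max(
        max(r[j] for r in results) - min(r[j] for r in results)
        for j in range(background.shape[1])
    )

print("additive model spread:   ", order_spread(additive_model))
print("interaction model spread:", order_spread(interaction_model))
```

In this sketch the additive model yields the same contributions under every ordering, while the interaction model assigns a feature a different contribution depending on whether it is fixed first or second. A nonzero spread of this kind is one simple signal that a single additive explanation may not describe the model faithfully.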
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Interpretability · Shapley Additive Explanations · Local Interpretable Model-Agnostic Explanations
