A Bayesian explanation of machine learning models based on modes and functional ANOVA
Quan Long

TL;DR
This paper introduces a Bayesian approach to explain deviations in model predictions by identifying influential features through ANOVA decomposition, offering more intuitive and robust explanations than traditional mean-based methods.
Contribution
It presents a novel Bayesian inverse explanation method that leverages ANOVA decomposition to identify features causing prediction deviations, improving interpretability and robustness.
Findings
More human-intuitive explanations than SHAP values
Robustness to deviations in label values
Dimension-independent extra cost for solving the Bayesian inverse problem
Abstract
Most methods in explainable AI (XAI) focus on providing reasons for the prediction of a given set of features. However, we solve an inverse explanation problem, i.e., given the deviation of a label, find the reasons for this deviation. We use a Bayesian framework to recover the "true" features, conditioned on the observed label value. We efficiently explain the deviation of a label value from the mode by identifying and ranking the influential features using the "distances" in the ANOVA functional decomposition. We show that the new method is more human-intuitive and robust than methods based on mean values, e.g., SHapley Additive exPlanations (SHAP values). The extra costs of solving a Bayesian inverse problem are dimension-independent.
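To make the two steps in the abstract concrete, here is a toy sketch and not the paper's algorithm. It assumes a linear model, a Gaussian prior on the features centred at the mode, and Gaussian label noise, so the most probable features given an observed label have a closed form. It then ranks features by the size of their first-order terms in an ANOVA-style decomposition anchored at the mode. The names `map_features` and `anchored_first_order_terms`, the weights, and the variances are all hypothetical. With a Gaussian prior the mode and the mean coincide, so this toy cannot show the mode-versus-mean difference the paper emphasizes.

```python
import numpy as np


def map_features(w, mode, y_obs, prior_var=1.0, noise_var=0.1):
    """MAP features for y = w.x + noise, with prior x ~ N(mode, prior_var * I).

    Assumed toy setting: linear model and Gaussian prior/noise, which gives
    the standard Gaussian-conditioning closed form used below.
    """
    w = np.asarray(w, dtype=float)
    mode = np.asarray(mode, dtype=float)
    residual = y_obs - w @ mode  # label deviation from the prediction at the mode
    gain = prior_var * w / (prior_var * (w @ w) + noise_var)
    return mode + gain * residual


def anchored_first_order_terms(f, x, anchor):
    """First-order terms of a decomposition anchored at `anchor`.

    Term i is f evaluated with only coordinate i moved from the anchor to x,
    minus f(anchor).
    """
    x = np.asarray(x, dtype=float)
    anchor = np.asarray(anchor, dtype=float)
    base = f(anchor)
    terms = np.empty(len(x))
    for i in range(len(x)):
        z = anchor.copy()
        z[i] = x[i]
        terms[i] = f(z) - base
    return terms


# Hypothetical model and data, for illustration only.
w = np.array([2.0, -1.0, 0.5])
mode = np.zeros(3)


def f(x):
    return float(w @ x)


x_map = map_features(w, mode, y_obs=3.0)
terms = anchored_first_order_terms(f, x_map, mode)
ranking = np.argsort(-np.abs(terms))  # most influential feature first
```

In this linear toy the first-order terms add up exactly to the recovered deviation `f(x_map) - f(mode)`, and the ranking follows the squared weights. For a nonlinear model, higher-order interaction terms would also be needed.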
Taxonomy
Topics: Neural Networks and Applications
Methods: Sparse Evolutionary Training · Focus
