Interpreting Deep Learning Models with Marginal Attribution by Conditioning on Quantiles
M. Merz, R. Richman, T. Tsanakas, M.V. Wüthrich

TL;DR
This paper introduces MACQ, a global, model-agnostic method for interpreting deep learning models by analyzing how features contribute to predictions across different output levels, enhancing understanding of feature importance and interactions.
Contribution
The paper presents MACQ, a novel marginal attribution method conditioned on quantiles, which separates feature contributions from interactions and visualizes their relationship with output levels.
Findings
MACQ effectively explains feature importance across prediction regions.
MACQ separates marginal effects from interaction effects.
MACQ visualizes the relationship between features, output levels, and contributions.
Abstract
A vastly growing literature on explaining deep learning models has emerged. This paper contributes to that literature by introducing a global gradient-based model-agnostic method, which we call Marginal Attribution by Conditioning on Quantiles (MACQ). Our approach is based on analyzing the marginal attribution of predictions (outputs) to individual features (inputs). Specifically, we consider variable importance by fixing (global) output levels and, thus, explain how features marginally contribute across different regions of the prediction space. Hence, MACQ can be seen as a marginal attribution counterpart to approaches such as accumulated local effects (ALE), which study the sensitivities of outputs by perturbing inputs. Furthermore, MACQ allows us to separate marginal attribution of individual features from interaction effects, and to visually illustrate the 3-way relationship between…
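To make the idea of conditioning attributions on output levels concrete, here is a minimal NumPy sketch. It is one possible reading of the abstract, not the authors' implementation. It assumes that a feature's marginal contribution can be approximated by the feature value times the partial derivative of the output, and that "conditioning on a quantile" can be approximated by averaging over samples whose prediction rank lies in a small window around the level alpha. The names toy_model, toy_gradient, quantile_conditioned_attributions and the bandwidth parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def toy_model(x):
    # Hypothetical smooth function standing in for a trained network.
    return np.tanh(x[:, 0]) + 0.5 * x[:, 1] ** 2 + x[:, 0] * x[:, 2]


def toy_gradient(x):
    # Analytic partial derivatives of toy_model with respect to each feature.
    g = np.empty_like(x)
    g[:, 0] = 1.0 - np.tanh(x[:, 0]) ** 2 + x[:, 2]
    g[:, 1] = x[:, 1]
    g[:, 2] = x[:, 0]
    return g


def quantile_conditioned_attributions(x, preds, grads, alphas, bandwidth=0.05):
    # Assumption: contribution of feature j for one sample is x_j * df/dx_j.
    contrib = x * grads
    # Empirical quantile level of each prediction, scaled to [0, 1].
    ranks = preds.argsort().argsort() / (len(preds) - 1)
    out = np.full((len(alphas), x.shape[1]), np.nan)
    for i, alpha in enumerate(alphas):
        # Approximate "prediction equals the alpha-quantile" by a rank window.
        mask = np.abs(ranks - alpha) <= bandwidth
        if mask.any():
            out[i] = contrib[mask].mean(axis=0)
    return out


rng = np.random.default_rng(0)
x = rng.normal(size=(5000, 3))
preds = toy_model(x)
grads = toy_gradient(x)
alphas = np.linspace(0.1, 0.9, 9)
attr = quantile_conditioned_attributions(x, preds, grads, alphas)
```

In this sketch, attr[i, j] is the average contribution of feature j among samples whose prediction sits near level alphas[i], so plotting each column against alphas gives one curve per feature across the prediction space. For a real network, the analytic gradient would be replaced by gradients from an automatic differentiation framework.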
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
