TL;DR
This paper introduces a novel probabilistic framework for anomaly attribution in black-box regression models, enabling uncertainty quantification of input variable attributions without requiring access to training data.
Contribution
It proposes a generative perturbation-based method and a variational Bayes algorithm to compute both attribution scores and their uncertainties, addressing limitations of existing deviation-agnostic methods.
Findings
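Proposes a probabilistic framework for anomaly attribution, which the authors describe as novel.
Reports each attribution score as a predictive mean together with its uncertainty.
Targets black-box regression where the training dataset is unavailable.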
Abstract
We address the task of probabilistic anomaly attribution in the black-box regression setting, where the goal is to compute the probability distribution of the attribution score of each input variable, given an observed anomaly. The training dataset is assumed to be unavailable. This task differs from the standard XAI (explainable AI) scenario, since we wish to explain the anomalous deviation from a black-box prediction rather than the black-box model itself. We begin by showing that mainstream model-agnostic explanation methods, such as the Shapley values, are not suitable for this task because of their "deviation-agnostic property." We then propose a novel framework for probabilistic anomaly attribution that allows us to not only compute attribution scores as the predictive mean but also quantify the uncertainty of those scores. This is done by considering a generative process for…
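The "deviation-agnostic" point can be illustrated with a small sketch. The excerpt does not give the paper's formal definition, so this assumes the property means that the attribution depends only on the model and the input, not on the observed output. The toy model `black_box`, the zero baseline, and the observed values are all hypothetical, and the exact Shapley computation below is a textbook version, not the paper's method.

```python
from itertools import combinations
from math import factorial


def black_box(x):
    # Hypothetical stand-in for the black-box regression model.
    return 2.0 * x[0] + x[1] * x[2]


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a baseline input.

    Note that the observed output y is not an argument: the scores are
    determined by f, x, and the baseline alone.
    """
    n = len(x)

    def value(subset):
        # Features in `subset` take their value from x, the rest from baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            w = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                total += w * (value(set(s) | {i}) - value(set(s)))
        phi.append(total)
    return phi


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]  # assumed reference point
phi = shapley_values(black_box, x, baseline)

# Two hypothetical observations at the same input: one matching the
# prediction, one strongly anomalous. The deviation changes, phi does not.
for y_observed in (8.0, 20.0):
    deviation = y_observed - black_box(x)
    print(f"deviation={deviation:+.1f}  shapley={phi}")
```

In this sketch the scores sum to `black_box(x) - black_box(baseline)`, so they explain the prediction itself and stay the same however far the observed value lies from it. That is the gap the abstract says a deviation-aware attribution method has to close.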
