Uncertainty Quantification of Surrogate Explanations: an Ordinal Consensus Approach
Jonas Schulz, Rafael Poyiadzi, Raul Santos-Rodriguez

TL;DR
This paper introduces a method to quantify the uncertainty of surrogate explanations for black-box models by measuring ordinal consensus among diverse explainers, aiding practitioners in assessing explanation trustworthiness.
Contribution
It proposes a novel ensemble-based approach to estimate explanation uncertainty and introduces metrics for aggregating diverse surrogate explanations.
Findings
Uncertainty estimates improve trust in explanations.
The approach effectively identifies unreliable explanations.
Visualizations provide actionable insights beyond standard methods.
Abstract
Explainability of black-box machine learning models is crucial, in particular when they are deployed in critical applications such as medicine or autonomous cars. Existing approaches produce explanations for the predictions of models; however, how to assess the quality and reliability of such explanations remains an open question. In this paper we take a step further in order to provide the practitioner with tools to judge the trustworthiness of an explanation. To this end, we produce estimates of the uncertainty of a given explanation by measuring the ordinal consensus amongst a set of diverse bootstrapped surrogate explainers. While we encourage diversity by using ensemble techniques, we propose and analyse metrics to aggregate the information contained within the set of explainers through a rating scheme. We empirically illustrate the properties of this approach through experiments on…
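The idea of measuring ordinal consensus among bootstrapped surrogate explainers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy surrogate (a per-feature least-squares slope), the use of Kendall's coefficient of concordance W as the consensus measure, and all function names are hypothetical stand-ins, since the abstract does not specify the authors' explainers, metrics, or rating scheme.

```python
import random


def rank_by_magnitude(importances):
    """Rank features by absolute importance (1 = most important, ties broken by index)."""
    order = sorted(range(len(importances)), key=lambda j: -abs(importances[j]))
    ranks = [0] * len(importances)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def kendalls_w(rankings):
    """Kendall's W for m complete, tie-free rankings of n items (0 = no agreement, 1 = full)."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[j] for r in rankings) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def univariate_slopes(X, y):
    """Toy surrogate (assumption): least-squares slope of y on each feature separately."""
    n = len(X)
    y_mean = sum(y) / n
    slopes = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        x_mean = sum(col) / n
        cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(col, y))
        var = sum((a - x_mean) ** 2 for a in col)
        slopes.append(cov / var if var > 0 else 0.0)
    return slopes


def bootstrap_consensus(X, y, n_explainers=25, seed=0):
    """Fit one toy surrogate per bootstrap resample and score agreement of their rankings."""
    rng = random.Random(seed)
    n = len(X)
    rankings = []
    for _ in range(n_explainers):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        slopes = univariate_slopes([X[i] for i in idx], [y[i] for i in idx])
        rankings.append(rank_by_magnitude(slopes))
    return kendalls_w(rankings), rankings
```

Here `X` would hold samples drawn around the instance being explained and `y` the black-box predictions for them. Under these assumptions, a value of W near 1 suggests the surrogates agree on the feature ordering, while a value near 0 suggests the ordering is unstable and the explanation may deserve less trust.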
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning
