Reliable Post hoc Explanations: Modeling Uncertainty in Explainability

Dylan Slack; Sophie Hilgard; Sameer Singh; Himabindu Lakkaraju

arXiv:2008.05030 · cs.LG · November 9, 2021 · 23 cites

TL;DR

This paper introduces a Bayesian framework for local explanations that provides uncertainty estimates, improving the reliability, consistency, and efficiency of explanation methods like LIME and KernelSHAP in high-stakes settings.

Contribution

It develops Bayesian versions of LIME and KernelSHAP that output credible intervals, addressing instability and hyper-parameter tuning issues in existing methods.
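The idea of a local explanation with credible intervals can be sketched as a proximity-weighted Bayesian linear regression fit on perturbations around the input. The sketch below is illustrative only and is not the authors' implementation: the function name `bayesian_local_explanation`, the Gaussian perturbation scheme, the exponential proximity kernel, the normal-inverse-gamma conjugate prior, and all default values are assumptions made for this example.

```python
import numpy as np


def bayesian_local_explanation(predict_fn, x, n_perturbations=500,
                               noise_scale=0.5, kernel_width=0.75,
                               prior_precision=1.0, a0=1e-2, b0=1e-2,
                               cred=0.95, n_draws=4000, seed=0):
    """Hypothetical sketch: posterior mean and credible interval per feature.

    Assumed model: y_i ~ N(a_i^T phi, sigma^2 / w_i),
    phi | sigma^2 ~ N(0, sigma^2 / prior_precision * I),
    sigma^2 ~ InvGamma(a0, b0).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    d = x.shape[0]

    # Sample perturbations around x (assumed Gaussian noise) and query the model.
    Z = x + noise_scale * rng.standard_normal((n_perturbations, d))
    y = np.asarray(predict_fn(Z), dtype=float)

    # Proximity weights: perturbations closer to x count more.
    dist2 = np.sum((Z - x) ** 2, axis=1)
    w = np.exp(-dist2 / kernel_width ** 2)

    # Design matrix: intercept plus features centered on x.
    A = np.hstack([np.ones((n_perturbations, 1)), Z - x])
    AW = A * w[:, None]

    # Conjugate posterior for the weighted linear model.
    G = A.T @ AW + prior_precision * np.eye(d + 1)   # posterior precision / sigma^2
    mu = np.linalg.solve(G, AW.T @ y)                # posterior mean of phi
    a_n = a0 + n_perturbations / 2.0
    b_n = b0 + 0.5 * max(y @ (w * y) - mu @ G @ mu, 0.0)

    # Monte Carlo draws: sigma^2 ~ InvGamma(a_n, b_n), then phi | sigma^2.
    sigma2 = b_n / rng.gamma(a_n, 1.0, size=n_draws)
    L = np.linalg.cholesky(np.linalg.inv(G))
    noise = rng.standard_normal((n_draws, d + 1)) @ L.T
    draws = mu + np.sqrt(sigma2)[:, None] * noise

    # Equal-tailed credible interval for each coefficient.
    tail = 100.0 * (1.0 - cred) / 2.0
    lo, hi = np.percentile(draws, [tail, 100.0 - tail], axis=0)

    # Drop the intercept; return per-feature importances and intervals.
    return mu[1:], lo[1:], hi[1:]


# Toy black box (hypothetical): linear in the first two features only.
def toy_model(Z):
    return 3.0 * Z[:, 0] - 2.0 * Z[:, 1]


mean, lower, upper = bayesian_local_explanation(toy_model, [1.0, 0.5, -0.2])
for j in range(len(mean)):
    print(f"feature {j}: {mean[j]:+.3f}  [{lower[j]:+.3f}, {upper[j]:+.3f}]")
```

In this sketch the interval width depends on the residual variance and on how many perturbations carry weight near the input, so drawing more perturbations narrows the intervals. The prior pulls the coefficients slightly toward zero, which is a property of the assumed prior rather than a claim about the paper's results.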

Findings

01

Explanations include credible intervals for feature importance.

02

The framework improves stability and consistency of explanations.

03

Theoretical analysis uses the uncertainty estimates to determine how many perturbations to sample and how to sample them for faster convergence.

Abstract

As black box explanations are increasingly being employed to establish model credibility in high-stakes settings, it is important to ensure that these explanations are accurate and reliable. However, prior work demonstrates that explanations generated by state-of-the-art techniques are inconsistent, unstable, and provide very little insight into their correctness and reliability. In addition, these methods are also computationally inefficient, and require significant hyper-parameter tuning. In this paper, we address the aforementioned challenges by developing a novel Bayesian framework for generating local explanations along with their associated uncertainty. We instantiate this framework to obtain Bayesian versions of LIME and KernelSHAP which output credible intervals for the feature importances, capturing the associated uncertainty. The resulting explanations not only enable us to…

Code & Models

Repositories

dylan-slack/modeling-uncertainty-local-explainability
pytorch

Videos

Reliable Post hoc Explanations: Modeling Uncertainty in Explainability · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning

Methods: Local Interpretable Model-Agnostic Explanations