Extracting Explanations, Justification, and Uncertainty from Black-Box Deep Neural Networks

Paul Ardis; Arjuna Flenner

arXiv:2403.08652 · cs.LG · March 14, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a Bayesian method to extract explanations, justifications, and uncertainty estimates from black-box deep neural networks, enhancing their interpretability and reliability without retraining.

Contribution

A novel, efficient Bayesian approach that provides explanations and uncertainty measures for any black-box DNN without retraining.

Findings

01 Improves interpretability of DNNs

02 Enhances reliability in anomaly detection

03 Applicable to out-of-distribution detection

Abstract

Deep Neural Networks (DNNs) do not inherently compute or exhibit empirically justified task confidence. In mission-critical applications, it is important to understand both the associated DNN reasoning and its supporting evidence. In this paper, we propose a novel Bayesian approach to extract explanations, justifications, and uncertainty estimates from DNNs. Our approach is efficient in terms of both memory and computation, and can be applied to any black-box DNN without retraining, including in anomaly detection and out-of-distribution detection tasks. We validate our approach on the CIFAR-10 dataset and show that it can significantly improve the interpretability and reliability of DNNs.

Taxonomy

Topics: Explainable Artificial Intelligence (XAI)