Gradient-free Post-hoc Explainability Using Distillation Aided Learnable Approach

Debarpan Bhattacharya; Amir H. Poorjam; Deepak Mittal; Sriram Ganapathy

arXiv:2409.11123·cs.AI·September 18, 2024

TL;DR

This paper introduces DAX, a gradient-free, post-hoc explainability framework that uses distillation and learnable masks to generate saliency explanations for black-box models across image and audio data.

Contribution

The paper presents a novel, model-agnostic approach combining mask learning and distillation for explainability without gradient access, outperforming existing methods.

Findings

1. DAX significantly outperforms nine existing methods across multiple evaluation metrics.
2. DAX identifies salient regions in both image and audio inputs.
3. Jointly optimizing the mask generation and distillation networks improves explanation quality.

Abstract

The recent advancements in artificial intelligence (AI), with the release of several large models having only query access, make a strong case for explainability of deep models in a post-hoc, gradient-free manner. In this paper, we propose a framework, named distillation aided explainability (DAX), that attempts to generate a saliency-based explanation in a model-agnostic, gradient-free application. The DAX approach poses the problem of explanation in a learnable setting with a mask generation network and a distillation network. The mask generation network learns to generate the multiplier mask that finds the salient regions of the input, while the student distillation network aims to approximate the local behavior of the black-box model. We propose a joint optimization of the two networks in the DAX framework using the locally perturbed input samples, with the targets derived from…
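
The joint optimization described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it (`black_box`, `learn_mask`, the hyperparameters) is hypothetical. It makes several simplifying assumptions: the mask is a single learnable vector rather than the output of a mask generation network, the student is one logistic unit, the targets are taken to be the black-box outputs on the perturbed samples (the abstract is truncated at that point), and a small sparsity penalty on the mask is added for illustration. The black box is only ever queried; gradients flow through the mask and the student alone.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def black_box(x):
    # Hypothetical query-only model: only features 0 and 1 affect the output.
    return sigmoid(3.0 * (x[0] - x[1]))


def learn_mask(x0, n_samples=200, steps=500, lr=0.5, sparsity=0.05, seed=0):
    """Jointly fit a multiplier mask and a student on perturbations of x0."""
    rng = random.Random(seed)
    d = len(x0)
    # Local perturbations around x0; targets come from black-box queries
    # (an assumption, since the abstract does not finish this sentence).
    xs = [[xj + rng.gauss(0.0, 1.0) for xj in x0] for _ in range(n_samples)]
    ys = [black_box(x) for x in xs]
    theta = [0.0] * d  # mask logits, mask = sigmoid(theta)
    v = [0.0] * d      # student weights
    b = 0.0            # student bias
    for _ in range(steps):
        m = [sigmoid(t) for t in theta]
        # Gradient of the assumed penalty: sparsity * mean(mask).
        g_theta = [sparsity * mj * (1.0 - mj) / d for mj in m]
        g_v = [0.0] * d
        g_b = 0.0
        for x, y in zip(xs, ys):
            z = [mj * xj for mj, xj in zip(m, x)]  # masked input
            p = sigmoid(sum(vj * zj for vj, zj in zip(v, z)) + b)
            # Derivative of the mean squared error w.r.t. the pre-activation.
            delta = 2.0 * (p - y) * p * (1.0 - p) / n_samples
            g_b += delta
            for j in range(d):
                g_v[j] += delta * z[j]
                g_theta[j] += delta * v[j] * x[j] * m[j] * (1.0 - m[j])
        # One joint gradient step on the mask and the student.
        theta = [t - lr * g for t, g in zip(theta, g_theta)]
        v = [w - lr * g for w, g in zip(v, g_v)]
        b -= lr * g_b
    return [sigmoid(t) for t in theta]
```

Calling `learn_mask([0.8, 0.8, -0.5, 1.2, 0.3, -1.0])` returns one mask value per feature. In this toy setting the two features the black box actually uses should end up with the largest values, which is the sense in which the mask acts as a saliency explanation.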

Code & Models

Repositories

iiscleap/dax (Official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Speech Recognition and Synthesis · Seismology and Earthquake Studies

Methods: Sparse Evolutionary Training