On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-box

Yi Cai; Gerhard Wunder

arXiv:2308.09381·cs.LG·May 15, 2024

TL;DR

This paper introduces GEEX (gradient-estimation-based explanation), a method that produces gradient-like feature attributions using only query access to the model, narrowing the gap between black-box and white-box explanations.

Contribution

It proposes a gradient-estimation approach to feature attribution that requires no access to model internals, satisfies a set of fundamental attribution properties that are proved mathematically, and is evaluated empirically with a focus on image data.

Findings

01 Outperforms existing black-box attribution methods

02 Achieves competitive results with white-box gradient-based methods

03 Provides theoretically guaranteed explanation quality

Abstract

Attribution methods shed light on the explainability of data-driven approaches such as deep learning models by uncovering the most influential features in a to-be-explained decision. While determining feature attributions via gradients delivers promising results, the internal access required for acquiring gradients can be impractical under safety concerns, thus limiting the applicability of gradient-based approaches. In response to such limited flexibility, this paper presents GEEX (gradient-estimation-based explanation), an approach that produces gradient-like explanations through only query-level access. The proposed approach holds a set of fundamental properties for attribution methods, which are rigorously proved, ensuring the quality of its explanations. In addition to the theoretical analysis, with a focus on image data, the experimental results empirically…
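To make the idea of obtaining gradient-like attributions from query access alone more concrete, here is a minimal sketch of a generic zeroth-order gradient estimator. It is not the paper's estimator: it uses a standard antithetic Gaussian-perturbation scheme, and the names `estimate_gradient` and `black_box`, the toy linear model, and all parameter values are assumptions chosen purely for illustration.

```python
import random


def estimate_gradient(f, x, n_samples=2000, sigma=0.1, seed=0):
    """Estimate the gradient of f at x using only calls to f (no internals).

    Generic antithetic Gaussian-smoothing estimator (an illustrative choice):
        grad ~ 1 / (2 * sigma * n) * sum_i [f(x + sigma*u_i) - f(x - sigma*u_i)] * u_i
    with u_i drawn from a standard normal distribution.
    """
    rng = random.Random(seed)
    d = len(x)
    grad = [0.0] * d
    for _ in range(n_samples):
        # Random perturbation direction.
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x_plus = [xi + sigma * ui for xi, ui in zip(x, u)]
        x_minus = [xi - sigma * ui for xi, ui in zip(x, u)]
        # Two queries to the model per sample; nothing else is observed.
        diff = f(x_plus) - f(x_minus)
        for j in range(d):
            grad[j] += diff * u[j] / (2.0 * sigma * n_samples)
    return grad


def black_box(x):
    """Hypothetical stand-in for a model we can only query."""
    w = [2.0, -1.0, 0.5]  # hidden from the explainer
    return sum(wi * xi for wi, xi in zip(w, x))


# The estimate serves as a per-feature attribution for this input.
attribution = estimate_gradient(black_box, [0.3, 0.7, 0.1], n_samples=20000)
print(attribution)
```

For this toy linear model the true gradient equals the hidden weights, so the estimate can be checked directly. For image models the input dimension is far larger, and the number of queries needed for a stable estimate grows accordingly, which is the practical cost such black-box approaches have to manage.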

Code & Models

Repositories

caiy0220/geex (PyTorch, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Radiomics and Machine Learning in Medical Imaging · Adversarial Robustness in Machine Learning

Methods: Focus