What You See is What You Classify: Black Box Attributions

Steven Stalder; Nathanaël Perraudin; Radhakrishna Achanta; Fernando Perez-Cruz; Michele Volpi

arXiv:2205.11266 · cs.CV · August 1, 2024 · 6 citations

TL;DR

This paper introduces a novel method to generate precise, class-specific attribution masks for deep image classifiers by training a secondary network, improving accuracy and efficiency over existing saliency map techniques.

Contribution

We propose training a secondary network to predict attribution masks for a black-box classifier, enabling sharper, class-specific explanations in a single inference step.

Findings

01. Produces sharper, boundary-precise masks
02. Generates distinct class-specific masks efficiently
03. Outperforms existing methods on PASCAL VOC-2007 and COCO-2014

Abstract

An important step towards explaining deep image classifiers lies in the identification of image regions that contribute to individual class scores in the model's output. However, doing this accurately is a difficult task due to the black-box nature of such networks. Most existing approaches find such attributions either using activations and gradients or by repeatedly perturbing the input. We instead address this challenge by training a second deep network, the Explainer, to predict attributions for a pre-trained black-box classifier, the Explanandum. These attributions are provided in the form of masks that only show the classifier-relevant parts of an image, masking out the rest. Our approach produces sharper and more boundary-precise masks when compared to the saliency maps generated by other methods. Moreover, unlike most existing approaches, ours is capable of directly generating…
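To make the masking idea in the abstract concrete, the following is a minimal sketch in NumPy, not the authors' implementation. It assumes a mask with values in [0, 1] that is multiplied element-wise with the image, and it uses a hypothetical toy function, `toy_explanandum`, as a stand-in for the pre-trained black-box classifier. The hand-written mask here stands in for what the Explainer would predict; how the Explainer is trained is not shown.

```python
import numpy as np


def apply_mask(image, mask):
    """Keep pixels where the mask is 1 and zero out pixels where it is 0.

    image: float array of shape (H, W, C)
    mask:  float array of shape (H, W) with values in [0, 1]
    """
    # Add a trailing channel axis so the mask broadcasts over all channels.
    return image * mask[..., None]


def toy_explanandum(image):
    """Hypothetical stand-in for a black-box classifier (assumption).

    Returns two class scores: the mean brightness of the left half
    (class 0) and of the right half (class 1) of the image.
    """
    half = image.shape[1] // 2
    return np.array([image[:, :half].mean(), image[:, half:].mean()])


# A uniform 4x4 RGB image: the toy classifier scores both classes equally.
image = np.ones((4, 4, 3))

# Hypothetical class-specific mask for class 0: keep only the left half.
mask = np.zeros((4, 4))
mask[:, :2] = 1.0

masked = apply_mask(image, mask)
scores = toy_explanandum(masked)
```

In this toy setup the masked image preserves the score for class 0 and removes the evidence for class 1, which is the behaviour one would want from a class-specific attribution mask.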

Code & Models

Repositories

stevenstalder/nn-explainer (PyTorch, official)

Videos

What You See is What You Classify: Black Box Attributions · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis