The Manifold Hypothesis for Gradient-Based Explanations

Sebastian Bordt; Uddeshya Upadhyay; Zeynep Akata; Ulrike von Luxburg

arXiv:2206.07387 · cs.LG · July 16, 2024 · 1 citation

TL;DR

This paper hypothesizes that gradient-based explanations are more perceptually meaningful when they are aligned with the tangent space of the data manifold, and supports the hypothesis with experiments across multiple datasets and explanation methods.

Contribution

The paper introduces a framework based on variational autoencoders for estimating and generating image manifolds, and uses it to show that alignment with the tangent space of the data matters for explanation quality.

Findings

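01 The more closely a feature attribution is aligned with the tangent space of the data, the more perceptually aligned it tends to be.

02 Attributions from popular post-hoc methods such as Integrated Gradients and SmoothGrad are more strongly aligned with the data manifold than the raw gradient.

03 Adversarial training improves the alignment of model gradients with the data manifold.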
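For readers unfamiliar with the post-hoc methods named in the findings, the following is a minimal sketch of the standard Integrated Gradients formula, approximated with a midpoint Riemann sum. It is not taken from the paper's repository: `toy_model`, `toy_grad`, and the zero baseline are hypothetical choices made purely for illustration, and a real model would supply its gradient through an autodiff library.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients along the straight path baseline -> x."""
    x = np.asarray(x, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    # Midpoints of `steps` equal sub-intervals of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    # Average the model gradient over points on the interpolation path.
    avg_grad = np.mean(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas], axis=0
    )
    # Scale by the input difference to obtain per-feature attributions.
    return (x - baseline) * avg_grad

# Hypothetical toy model f(x) = sum(x_i^2) with its analytic gradient.
def toy_model(x):
    return float(np.sum(np.asarray(x, dtype=float) ** 2))

def toy_grad(x):
    return 2.0 * np.asarray(x, dtype=float)

x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros(3)
attributions = integrated_gradients(toy_grad, x, baseline)
```

For this toy model the attributions should sum to approximately `toy_model(x) - toy_model(baseline)`, which is the completeness property of Integrated Gradients and a useful sanity check for any implementation.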

Abstract

When do gradient-based explanation algorithms provide perceptually-aligned explanations? We propose a criterion: the feature attributions need to be aligned with the tangent space of the data manifold. To provide evidence for this hypothesis, we introduce a framework based on variational autoencoders that allows us to estimate and generate image manifolds. Through experiments across a range of different datasets -- MNIST, EMNIST, CIFAR10, X-ray pneumonia and Diabetic Retinopathy detection -- we demonstrate that the more a feature attribution is aligned with the tangent space of the data, the more perceptually-aligned it tends to be. We then show that the attributions provided by popular post-hoc methods such as Integrated Gradients and SmoothGrad are more strongly aligned with the data manifold than the raw gradient. Adversarial training also improves the alignment of model gradients with…
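The alignment criterion in the abstract can be made concrete with a small numerical sketch. Assuming the data manifold is the image of a decoder (as in a variational autoencoder), the tangent space at a point is spanned by the columns of the decoder's Jacobian, and one plausible alignment score is the fraction of an attribution's norm that survives projection onto that space. Everything below is an illustrative assumption rather than the authors' code: `toy_decoder` is a hypothetical two-dimensional latent map into three dimensions, and the Jacobian is estimated by finite differences instead of autodiff.

```python
import numpy as np

def numerical_jacobian(decoder, z, eps=1e-5):
    """Central-difference Jacobian of decoder at latent point z, shape (d, k)."""
    z = np.asarray(z, dtype=float)
    cols = []
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        cols.append((decoder(z + step) - decoder(z - step)) / (2 * eps))
    return np.stack(cols, axis=1)

def tangent_fraction(attribution, jacobian):
    """Norm of the tangent-space projection divided by the attribution norm."""
    # Orthonormal basis of the Jacobian's column space
    # (assumes the Jacobian has full column rank).
    q, _ = np.linalg.qr(jacobian)
    a = np.asarray(attribution, dtype=float).ravel()
    projected = q @ (q.T @ a)
    return float(np.linalg.norm(projected) / np.linalg.norm(a))

# Hypothetical decoder: a paraboloid surface embedded in 3-D.
def toy_decoder(z):
    return np.array([z[0], z[1], z[0] ** 2 + z[1] ** 2])

# At the origin the tangent plane of the paraboloid is the x-y plane.
J = numerical_jacobian(toy_decoder, np.zeros(2))
in_plane = tangent_fraction(np.array([1.0, 0.0, 0.0]), J)
off_plane = tangent_fraction(np.array([0.0, 0.0, 1.0]), J)
mixed = tangent_fraction(np.array([1.0, 0.0, 1.0]), J)
```

A score near 1 means the attribution lies almost entirely in the tangent space, while a score near 0 means it points off the manifold. For image-sized inputs the Jacobian would typically come from an autodiff library applied to the trained decoder instead of finite differences.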

Code & Models

Repositories

tml-tuebingen/explanations-manifold (PyTorch, official)

Taxonomy

Topics: COVID-19 diagnosis using AI · Explainable Artificial Intelligence (XAI) · Machine Learning in Healthcare

Methods: ALIGN