The Gaussian Discriminant Variational Autoencoder (GdVAE): A Self-Explainable Model with Counterfactual Explanations

Anselm Haselhoff; Kevin Trelenberg; Fabian Küppers; Jonas Schneider

arXiv:2409.12952·cs.CV·September 20, 2024

TL;DR

GdVAE is a self-explainable image classifier built on a conditional variational autoencoder with a Gaussian discriminant analysis classifier. It produces counterfactual ("what-if") explanations as an integrated part of the model rather than through post-hoc training or inference-time optimization.

Contribution

The paper introduces GdVAE, a framework that integrates a generative classifier with counterfactual explanations in a single model, aiming to close the gap between transparent self-explainable models and counterfactual methods for images.

Findings

01 Produces high-quality counterfactual explanations

02 Maintains full transparency through generative classification

03 Outperforms existing methods in explanation effectiveness

Abstract

Visual counterfactual explanation (CF) methods modify image concepts, e.g., shape, to change a prediction to a predefined outcome while closely resembling the original query image. Unlike self-explainable models (SEMs) and heatmap techniques, they grant users the ability to examine hypothetical "what-if" scenarios. Previous CF methods either entail post-hoc training, limiting the balance between transparency and CF quality, or demand optimization during inference. To bridge the gap between transparent SEMs and CF methods, we introduce the GdVAE, a self-explainable model based on a conditional variational autoencoder (CVAE), featuring a Gaussian discriminant analysis (GDA) classifier and integrated CF explanations. Full transparency is achieved through a generative classifier that leverages class-specific prototypes for the downstream task and a closed-form solution for CFs in the latent…
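
The combination of a GDA classifier with a closed-form latent counterfactual can be illustrated with a small, generic sketch. This is not the authors' implementation: it assumes only two classes, a shared isotropic covariance `sigma2 * I`, and a counterfactual defined as the smallest Euclidean shift of the latent vector that reaches a target posterior. All function and parameter names (`gda_log_odds`, `counterfactual`, `posterior`, `sigma2`, `prior1`) are hypothetical and chosen for illustration.

```python
import math


def gda_log_odds(z, mu0, mu1, sigma2=1.0, prior1=0.5):
    """Return (log-odds of class 1 vs. class 0, weight vector w).

    Assumption: both classes are Gaussian with means mu0 / mu1 (the class
    prototypes) and the same covariance sigma2 * I, so the log-odds are
    linear in z: logit(z) = w . z + bias.
    """
    w = [(b - a) / sigma2 for a, b in zip(mu0, mu1)]
    bias = (sum(a * a for a in mu0) - sum(b * b for b in mu1)) / (2.0 * sigma2)
    bias += math.log(prior1 / (1.0 - prior1))
    logit = sum(wi * zi for wi, zi in zip(w, z)) + bias
    return logit, w


def posterior(z, mu0, mu1, sigma2=1.0, prior1=0.5):
    """Posterior probability p(y = 1 | z) under the model above."""
    logit, _ = gda_log_odds(z, mu0, mu1, sigma2, prior1)
    return 1.0 / (1.0 + math.exp(-logit))


def counterfactual(z, mu0, mu1, target_prob, sigma2=1.0, prior1=0.5):
    """Smallest Euclidean shift of z whose class-1 posterior is target_prob.

    Because the log-odds are linear in z, the shift has a closed form:
    move along w by (target_logit - logit(z)) / ||w||^2.
    """
    logit, w = gda_log_odds(z, mu0, mu1, sigma2, prior1)
    target_logit = math.log(target_prob / (1.0 - target_prob))
    step = (target_logit - logit) / sum(wi * wi for wi in w)
    return [zi + step * wi for zi, wi in zip(z, w)]


if __name__ == "__main__":
    mu0, mu1 = [-1.0, 0.0], [1.0, 0.0]   # hypothetical class prototypes
    z = [-0.5, 0.3]                      # hypothetical latent code of a query
    z_cf = counterfactual(z, mu0, mu1, target_prob=0.9)
    print(posterior(z, mu0, mu1), posterior(z_cf, mu0, mu1), z_cf)
```

In a full model of this kind, the shifted latent vector would be passed through the decoder to render the counterfactual image. Here the latent step is shown in isolation, and the decoder is omitted.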

Code & Models

Repositories

trustinai/gdvaecode
PyTorch · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications

Methods: Heatmap · Conditional Variational Autoencoder · Counterfactual Explanations