The Gaussian Discriminant Variational Autoencoder (GdVAE): A Self-Explainable Model with Counterfactual Explanations
Anselm Haselhoff, Kevin Trelenberg, Fabian Küppers, Jonas Schneider

TL;DR
GdVAE is a self-explainable model built on a conditional variational autoencoder with integrated counterfactual explanations, so that image classification decisions are transparent and users can explore high-quality "what-if" scenarios.
Contribution
The paper introduces GdVAE, a framework that integrates a generative classifier with counterfactual explanations to improve transparency and explanation quality in image models.
Findings
Produces high-quality counterfactual explanations
Maintains full transparency through generative classification
Outperforms existing methods in explanation effectiveness
Abstract
Visual counterfactual explanation (CF) methods modify image concepts, e.g., shape, to change a prediction to a predefined outcome while closely resembling the original query image. Unlike self-explainable models (SEMs) and heatmap techniques, they grant users the ability to examine hypothetical "what-if" scenarios. Previous CF methods either entail post-hoc training, limiting the balance between transparency and CF quality, or demand optimization during inference. To bridge the gap between transparent SEMs and CF methods, we introduce the GdVAE, a self-explainable model based on a conditional variational autoencoder (CVAE), featuring a Gaussian discriminant analysis (GDA) classifier and integrated CF explanations. Full transparency is achieved through a generative classifier that leverages class-specific prototypes for the downstream task and a closed-form solution for CFs in the latent…
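To make the idea of a prototype-based GDA classifier with a closed-form counterfactual more concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes two classes, a shared covariance matrix, and latent codes that are already available as a NumPy array, and the function names (`fit_gda`, `log_odds`, `counterfactual`) are hypothetical. Under those assumptions the classifier's log-odds are linear in the latent code, so the smallest (Mahalanobis) shift that reaches a chosen log-odds value can be written down directly and lies along the line between the two class prototypes.

```python
import numpy as np

def fit_gda(z, y):
    """Fit a two-class GDA with shared covariance on latent codes z of shape (n, d)."""
    mu0 = z[y == 0].mean(axis=0)          # prototype (mean) of class 0
    mu1 = z[y == 1].mean(axis=0)          # prototype (mean) of class 1
    centered = np.where((y == 1)[:, None], z - mu1, z - mu0)
    sigma = centered.T @ centered / len(z)  # pooled covariance estimate
    prior1 = y.mean()                     # empirical prior of class 1
    return mu0, mu1, sigma, prior1

def log_odds(z, mu0, mu1, sigma, prior1):
    """Log-odds log p(y=1|z) / p(y=0|z), which is linear in z for shared covariance."""
    w = np.linalg.solve(sigma, mu1 - mu0)
    b = -0.5 * (mu1 + mu0) @ w + np.log(prior1 / (1.0 - prior1))
    return z @ w + b

def counterfactual(z, target_log_odds, mu0, mu1, sigma, prior1):
    """Closest point to z (in Mahalanobis distance) whose log-odds equal the target."""
    w = np.linalg.solve(sigma, mu1 - mu0)
    direction = mu1 - mu0                 # equals sigma @ w
    gap = target_log_odds - log_odds(z, mu0, mu1, sigma, prior1)
    step = gap / (w @ direction)
    return z + step * direction

# Illustrative usage on synthetic latent codes (made-up data, not from the paper).
rng = np.random.default_rng(0)
z = np.vstack([rng.normal(-1, 1, (100, 3)), rng.normal(1, 1, (100, 3))])
y = np.repeat([0, 1], 100)
params = fit_gda(z, y)
z_cf = counterfactual(z[0], 2.0, *params)  # push a class-0 sample toward class 1
```

In a full model, the shifted latent code `z_cf` would be passed through the decoder to obtain the counterfactual image; that step is omitted here because it depends on the trained CVAE.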
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications
Methods: Heatmap · Conditional Variational Autoencoder · Counterfactual Explanations
