Explaining Classifiers with Causal Concept Effect (CaCE)
Yash Goyal, Amir Feder, Uri Shalit, Been Kim

TL;DR
This paper introduces CaCE, a causal measure of how human-interpretable concepts affect a neural network's decisions, and uses VAEs to estimate that effect while avoiding errors caused by confounding.
Contribution
It proposes CaCE as a causal explanation method for neural networks and demonstrates how VAEs can effectively estimate these causal effects in complex datasets.
Findings
VAE-CaCE accurately estimates true concept causal effects.
CaCE outperforms correlation-based methods in avoiding confounding.
VAE-CaCE is effective on high-dimensional image datasets.
Abstract
How can we understand classification decisions made by deep neural networks? Many existing explainability methods rely solely on correlations and fail to account for confounding, which may result in potentially misleading explanations. To overcome this problem, we define the Causal Concept Effect (CaCE) as the causal effect of (the presence or absence of) a human-interpretable concept on a deep neural net's predictions. We show that the CaCE measure can avoid errors stemming from confounding. Estimating CaCE is difficult in situations where we cannot easily simulate the do-operator. To mitigate this problem, we use a generative model, specifically a Variational AutoEncoder (VAE), to measure VAE-CaCE. In an extensive experimental analysis, we show that the VAE-CaCE is able to estimate the true concept causal effect, compared to baselines for a number of datasets including high…
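The gap between a correlation-based explanation and a causal one can be illustrated with a small simulation. The sketch below is not the paper's method: it replaces the VAE with a hand-written generative process whose concept can be set directly, and every name in it (`sample`, `classifier`, `observational_gap`, `interventional_effect`) is hypothetical. It assumes a binary confounder `z` that drives both the concept `c` and a feature `x`, and a toy classifier that looks only at `x`, so the concept's true causal effect on the prediction is zero.

```python
import random


def sample(rng, do_concept=None):
    """Draw one example from a hypothetical generative process.

    z is a hidden confounder that drives both the concept c and the feature x.
    Passing do_concept fixes c by intervention, cutting its link to z.
    """
    z = rng.random() < 0.5
    x = (1.0 if z else 0.0) + rng.gauss(0.0, 0.1)
    if do_concept is None:
        # Observational regime: the concept agrees with z 90% of the time.
        c = z if rng.random() < 0.9 else (not z)
    else:
        # Interventional regime: the concept is set regardless of z.
        c = bool(do_concept)
    return c, x


def classifier(c, x):
    """Toy classifier that ignores the concept and thresholds x."""
    return 1.0 if x > 0.5 else 0.0


def mean(values):
    return sum(values) / len(values)


def observational_gap(n, rng):
    """Difference in mean prediction between examples with and without the concept."""
    with_c, without_c = [], []
    for _ in range(n):
        c, x = sample(rng)
        (with_c if c else without_c).append(classifier(c, x))
    return mean(with_c) - mean(without_c)


def interventional_effect(n, rng):
    """Difference in mean prediction under do(c=1) versus do(c=0)."""
    on = [classifier(*sample(rng, do_concept=1)) for _ in range(n)]
    off = [classifier(*sample(rng, do_concept=0)) for _ in range(n)]
    return mean(on) - mean(off)


rng = random.Random(0)
obs = observational_gap(20000, rng)
effect = interventional_effect(20000, rng)
print(f"observational gap:     {obs:.3f}")
print(f"interventional effect: {effect:.3f}")
```

In this toy setting the observational gap is large because the concept is correlated with the confounder, while the interventional difference stays near zero, matching the fact that the classifier never reads the concept. When no such simulator exists, the abstract's proposal is to use a VAE to approximate the intervention.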
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
