Concept-Based Abductive and Contrastive Explanations for Behaviors of Vision Models
Ronaldo Canizales, Divya Gopinath, Corina Păsăreanu, Ravi Mangal

TL;DR
This paper combines concept-based explanations with formal abductive and contrastive explanations to identify the minimal sets of high-level concepts that are causally relevant to a vision model's predictions.
Contribution
It proposes a new framework for concept-based abductive and contrastive explanations, enabling causal understanding of model behaviors at the concept level.
Findings
Identifies minimal sets of high-level concepts that are causally linked to model outcomes.
Explains model predictions both on individual images and on collections of images that share a common behavior.
Demonstrates improved interpretability across multiple models and datasets.
Abstract
*Concept-based explanations* offer a promising approach for explaining the predictions of deep neural networks in terms of high-level, human-understandable concepts. However, existing methods either do not establish a causal connection between the concepts and model predictions or are limited in expressivity and only able to infer causal explanations involving single concepts. At the same time, the parallel line of work on *formal abductive and contrastive explanations* computes the minimal set of input features causally relevant for model outcomes but only considers low-level features such as pixels. Merging these two threads, in this work, we propose the notion of *concept-based abductive and contrastive explanations* that capture the minimal sets of high-level concepts causally relevant for model outcomes. We then present a family of algorithms that enumerate all minimal explanations…
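To make the two notions in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm; it only illustrates the standard definitions from the formal-explanation literature, applied to concepts instead of pixels. An abductive explanation is a subset-minimal set of concepts whose values, once fixed, guarantee the prediction no matter how the remaining concepts vary. A contrastive explanation is a subset-minimal set of concepts whose values can be changed to alter the prediction while everything else stays fixed. The concept names, the `toy_model` classifier, and all helper functions are hypothetical and chosen purely for illustration; a real vision model would need a separate mechanism to map images to concept values.

```python
from itertools import combinations, product

# Hypothetical binary concepts (assumption, not from the paper).
CONCEPTS = ["stripes", "four_legs", "mane"]

def toy_model(c):
    # Hypothetical stand-in for a classifier that operates on concept values.
    return "zebra" if c["stripes"] and (c["four_legs"] or c["mane"]) else "other"

def completions(instance, free):
    # Yield every variant of `instance` obtained by reassigning the `free` concepts.
    free = list(free)
    for values in product([False, True], repeat=len(free)):
        yield {**instance, **dict(zip(free, values))}

def is_sufficient(instance, fixed):
    # Abductive check: fixing `fixed` forces the prediction for all other values.
    target = toy_model(instance)
    free = [c for c in CONCEPTS if c not in fixed]
    return all(toy_model(x) == target for x in completions(instance, free))

def can_flip(instance, changed):
    # Contrastive check: some reassignment of `changed` alone alters the prediction.
    target = toy_model(instance)
    return any(toy_model(x) != target for x in completions(instance, changed))

def minimal_sets(instance, predicate):
    # Enumerate subsets by increasing size and skip supersets of earlier hits.
    # Both predicates are monotone, so this returns every subset-minimal set.
    found = []
    for k in range(len(CONCEPTS) + 1):
        for subset in combinations(CONCEPTS, k):
            s = set(subset)
            if any(f <= s for f in found):
                continue
            if predicate(instance, s):
                found.append(s)
    return found

instance = {"stripes": True, "four_legs": True, "mane": False}
abductive = minimal_sets(instance, is_sufficient)
contrastive = minimal_sets(instance, can_flip)
print(abductive)
print(contrastive)
```

For this toy instance the only minimal abductive explanation is the pair `stripes` and `four_legs`, while `stripes` and `four_legs` are each a minimal contrastive explanation on their own. Exhaustive enumeration is exponential in the number of concepts, so it is only workable for tiny concept vocabularies like this one.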
