Counterfactual Explanation Based on Gradual Construction for Deep Networks

Hong-Gyu Jung; Sin-Han Kang; Hee-Dong Kim; Dong-Ok Won; Seong-Whan Lee

arXiv:2008.01897·cs.LG·August 15, 2022

TL;DR

This paper introduces a novel counterfactual explanation method for deep networks that gradually constructs explanations by iteratively selecting and optimizing features, resulting in clearer, more human-friendly interpretations aligned with training data distributions.

Contribution

The proposed method uniquely combines masking and composition steps to produce more realistic and understandable counterfactual explanations based on training data statistics.

Findings

01. Produces human-friendly interpretations across various datasets

02. Achieves explanations with fewer feature modifications

03. Verifies alignment with training data distribution

Abstract

To understand the black-box characteristics of deep networks, counterfactual explanation, which deduces not only the important features of an input space but also how those features should be modified to classify the input as a target class, has gained increasing interest. The patterns that deep networks have learned from a training dataset can be grasped by observing how features vary among classes. However, current approaches modify features to increase the classification probability for the target class irrespective of the internal characteristics of deep networks. This often leads to unclear explanations that deviate from real-world data distributions. To address this problem, we propose a counterfactual explanation method that exploits the statistics learned from a training dataset. Specifically, we gradually construct an explanation by iterating over masking…
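The idea of gradually constructing an explanation, selecting one feature at a time and then optimizing only the selected features, can be illustrated with a toy sketch. This is not the authors' implementation: the logistic-regression classifier, the gradient-magnitude selection rule, the plain gradient-ascent update, and every name below (`gradual_counterfactual`, `threshold`, `lr`, `inner_steps`) are assumptions made for illustration. In particular, the sketch simply raises the target-class probability and does not use the training-data statistics that the abstract describes.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def target_prob(w, b, x):
    """Probability of the target class under a toy logistic model (an assumption)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def gradual_counterfactual(w, b, x, threshold=0.9, lr=0.5, inner_steps=20):
    """Grow the set of modified features one at a time until the target
    probability reaches `threshold` or every feature has been selected."""
    x_cf = list(x)
    selected = []
    while target_prob(w, b, x_cf) < threshold and len(selected) < len(x):
        # Selection step (hypothetical rule): pick the not-yet-selected
        # feature whose gradient of the target probability is largest.
        p = target_prob(w, b, x_cf)
        grad = [p * (1.0 - p) * wi for wi in w]
        candidates = [i for i in range(len(x)) if i not in selected]
        best = max(candidates, key=lambda i: abs(grad[i]))
        selected.append(best)

        # Optimization step (hypothetical rule): gradient ascent on the
        # target probability, changing only the selected features.
        for _ in range(inner_steps):
            p = target_prob(w, b, x_cf)
            if p >= threshold:
                break
            for i in selected:
                x_cf[i] += lr * p * (1.0 - p) * w[i]
    return x_cf, selected


# Usage on a three-feature toy input.
w, b = [3.0, -0.5, 0.1], -1.0
x = [0.0, 0.0, 0.0]
x_cf, selected = gradual_counterfactual(w, b, x)
print("modified features:", selected)
print("target probability:", round(target_prob(w, b, x_cf), 3))
```

Because features are added only while the target probability is still below the threshold, the loop tends to stop after modifying a small subset of features, which is the behavior the findings summarize as fewer feature modifications.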
