Why Does It Look There? Structured Explanations for Image Classification
Jiarui Li, Zixiang Yin, Samuel J Landry, Zhengming Ding, Ramgopal R. Mettu

TL;DR
This paper introduces I2X, a framework that creates structured, faithful explanations for image classification models by leveraging prototypes and checkpoints, enhancing interpretability and enabling model improvement.
Contribution
I2X is the first method to generate structured explanations directly from unstructured interpretability, linking prototypes to decision processes during training.
Findings
I2X effectively reveals prototype-based inference in models.
It improves prediction accuracy through targeted sample perturbation.
It demonstrates applicability on the MNIST and CIFAR10 datasets.
Abstract
Deep learning models achieve remarkable predictive performance, yet their black-box nature limits transparency and trustworthiness. Although numerous explainable artificial intelligence (XAI) methods have been proposed, they primarily provide saliency maps or concepts (i.e., unstructured interpretability). Existing approaches often rely on auxiliary models (e.g., GPT, CLIP) to describe model behavior, thereby compromising faithfulness to the original models. We propose Interpretability to Explainability (I2X), a framework that builds structured explanations directly from unstructured interpretability by quantifying progress at selected checkpoints during training using prototypes extracted from post-hoc XAI methods (e.g., GradCAM). I2X answers the question of "why does it look there" by providing a structured view of both intra- and inter-class decision making during training.…
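The abstract names GradCAM as one example of a post-hoc XAI method that can supply the raw saliency I2X starts from. As a rough illustration of what such a saliency map is, the following minimal sketch computes the standard Grad-CAM heatmap from a layer's feature maps and the gradients of a class score with respect to them. It is not the I2X pipeline and does not cover prototype extraction or checkpoint tracking; the function name `grad_cam` and the toy inputs are assumptions made for illustration only.

```python
def grad_cam(activations, gradients):
    """Standard Grad-CAM heatmap for one image and one target class.

    activations: list of K feature maps, each an H x W nested list of floats.
    gradients:   gradients of the class score w.r.t. those maps, same shape.
    Returns an H x W nested list of non-negative floats.
    """
    rows = len(activations[0])
    cols = len(activations[0][0])

    # Channel weight = spatial average of that channel's gradient.
    weights = [
        sum(sum(row) for row in grad_map) / (rows * cols)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps across channels.
    cam = [[0.0] * cols for _ in range(rows)]
    for weight, feature_map in zip(weights, activations):
        for i in range(rows):
            for j in range(cols):
                cam[i][j] += weight * feature_map[i][j]

    # ReLU keeps only locations that push the class score up.
    return [[max(0.0, value) for value in row] for row in cam]


# Toy example (hypothetical values): two 2x2 channels.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],      # channel weight  1.0
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel weight -1.0
]
heatmap = grad_cam(activations, gradients)
```

In this toy case the second channel has a negative weight, so the locations it alone activates are zeroed out by the ReLU and the heatmap highlights only the positions where the first channel fires.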
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications
