Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition
Lintong Zhang, Kang Yin, Seong-Whan Lee

TL;DR
This paper introduces a fine-grained counterfactual explanation framework that explains misclassifications in visual models at the object and part levels, and reports outperforming existing fine-grained explanation methods.
Contribution
It proposes a non-generative, saliency partition-based approach for detailed, region-specific interpretability of model misclassifications at object and part levels.
Findings
Outperforms existing fine-grained explanation methods.
Provides more intuitive and detailed local feature insights.
Effectively isolates region-specific feature relevance.
Abstract
Attribution-based explanation techniques capture key patterns to enhance visual interpretability; however, these patterns often lack the granularity needed for insight in fine-grained tasks, particularly in cases of model misclassification, where explanations may be insufficiently detailed. To address this limitation, we propose a fine-grained counterfactual explanation framework that generates both object-level and part-level interpretability, addressing two fundamental questions: (1) which fine-grained features contribute to model misclassification, and (2) where dominant local features influence counterfactual adjustments. Our approach yields explainable counterfactuals in a non-generative manner by quantifying similarity and weighting component contributions within regions of interest between correctly classified and misclassified samples. Furthermore, we introduce a saliency…
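To make the idea of "quantifying similarity and weighting component contributions within regions of interest" more concrete, here is a minimal Python sketch. It is not the authors' method, which the truncated abstract does not specify. It assumes that feature maps for a misclassified sample and a correctly classified reference are already available, that a saliency map and an integer region partition are given, and that a region's score is its share of total saliency multiplied by one minus the cosine similarity of the saliency-weighted mean features. The function name `region_scores` and all of its parameters are hypothetical.

```python
import numpy as np

def region_scores(feat_mis, feat_ref, saliency, regions):
    """Illustrative region scoring (an assumption, not the paper's formula).

    feat_mis, feat_ref: (C, H, W) feature maps of the misclassified sample
        and of a correctly classified reference sample.
    saliency: (H, W) non-negative saliency map for the misclassified sample.
    regions: (H, W) integer labels that partition the map into regions.
    Returns a dict mapping region label -> score, where a high score means
    the region is both salient and dissimilar to the reference.
    """
    total = saliency.sum()
    scores = {}
    for r in np.unique(regions):
        mask = regions == r              # boolean (H, W) mask of the region
        w = saliency[mask]               # saliency values inside the region
        if w.sum() == 0:                 # region carries no saliency mass
            scores[int(r)] = 0.0
            continue
        # Saliency-weighted mean feature vector inside the region.
        v_mis = (feat_mis[:, mask] * w).sum(axis=1) / w.sum()
        v_ref = (feat_ref[:, mask] * w).sum(axis=1) / w.sum()
        denom = np.linalg.norm(v_mis) * np.linalg.norm(v_ref)
        cos = float(v_mis @ v_ref / denom) if denom > 0 else 0.0
        # Share of total saliency times feature dissimilarity.
        scores[int(r)] = float(w.sum() / total) * (1.0 - cos)
    return scores
```

Under these assumptions, sorting the returned scores in descending order gives a rough ranking of the regions whose features differ most from the correctly classified reference while also drawing the most saliency.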
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
