TL;DR
This paper introduces a novel explanation model for image classifiers that combines global and local knowledge through class association embedding, enabling more accurate and interpretable explanations via counter-factual sample generation.
Contribution
The study proposes a new generative explanation model using class association embedding that improves global knowledge extraction and explanation accuracy for image classifiers.
Findings
Achieves higher explanation accuracy than state-of-the-art methods
Effectively visualizes classification decision patterns
Enables counter-factual generation for individual sample explanations
Abstract
Image classification is a primary task in data analysis, and explainable models are in high demand across many applications. Although many methods have been proposed to obtain explainable knowledge from black-box classifiers, these approaches are inefficient at extracting global knowledge about the classification task. They are therefore vulnerable to local traps and often yield poor accuracy. In this study, we propose a generative explanation model that combines the advantages of global and local knowledge for explaining image classifiers. We develop a representation learning method called class association embedding (CAE), which encodes each sample into a pair of separate codes: a class-associated code and an individual code. Recombining the individual code of a given sample with an altered class-associated code yields a realistic synthetic sample with preserved individual characteristics…
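The recombination step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a random orthogonal matrix in place of the learned encoder and decoder, and a fixed split of the latent vector into the two codes. All names and sizes (`encode`, `decode`, `counterfactual`, `DIM`, `CLASS_DIM`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

DIM = 8        # toy sample dimensionality (hypothetical)
CLASS_DIM = 2  # length of the class-associated code (hypothetical)

# Assumption: a random orthogonal matrix stands in for a learned,
# invertible encoder/decoder pair. The real CAE model is a trained network.
Q, _ = np.linalg.qr(rng.normal(size=(DIM, DIM)))


def encode(x):
    """Split a sample into (class-associated code, individual code)."""
    z = Q.T @ x
    return z[:CLASS_DIM], z[CLASS_DIM:]


def decode(class_code, individual_code):
    """Rebuild a sample from the two codes."""
    return Q @ np.concatenate([class_code, individual_code])


def counterfactual(x, donor):
    """Keep x's individual code, take the class-associated code from donor."""
    _, individual_code = encode(x)
    donor_class_code, _ = encode(donor)
    return decode(donor_class_code, individual_code)


x_a = rng.normal(size=DIM)  # sample to explain
x_b = rng.normal(size=DIM)  # sample supplying the altered class code
x_cf = counterfactual(x_a, x_b)

# Re-encoding the synthetic sample recovers the swapped codes.
cf_class_code, cf_individual_code = encode(x_cf)
```

In this toy setting the synthetic sample carries the donor's class-associated code and the original sample's individual code, which is the property the abstract relies on for counterfactual generation.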
