Cartoon Explanations of Image Classifiers
Stefan Kolek, Duc Anh Nguyen, Ron Levie, Joan Bruna, Gitta Kutyniok

TL;DR
CartoonX is a new explanation method for image classifiers that leverages wavelet sparsity to produce more meaningful and concise explanations, especially for misclassified images.
Contribution
It adds a wavelet-sparsity constraint to the rate-distortion explanation (RDE) framework. The result is a model-agnostic method whose explanations highlight the relevant piece-wise smooth part of an image rather than pixel-sparse regions.
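As a rough sketch of what a wavelet-sparsity constraint in an RDE-style objective can look like, the optimization may be written as below. The notation is chosen here for illustration and is an assumption, not taken from this page: $\Phi$ is the classifier, $x$ the input image, $\mathcal{T}$ a discrete wavelet transform with coefficients $h = \mathcal{T}(x)$, $s$ a mask on those coefficients, $v$ a random perturbation drawn from a distribution $\mathcal{V}$, $d$ a distortion measure on classifier outputs, and $\lambda > 0$ a sparsity weight.

```latex
\min_{s \in [0,1]^{k}} \;
  \mathbb{E}_{v \sim \mathcal{V}}
  \Big[ d\Big( \Phi(x),\;
        \Phi\big( \mathcal{T}^{-1}( s \odot h + (1 - s) \odot v ) \big) \Big) \Big]
  \;+\; \lambda \, \lVert s \rVert_{1}
```

The first term keeps the classifier output close to the original when only the masked-in wavelet coefficients are retained; the $\ell_1$ term pushes the mask toward few coefficients.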
Findings
CartoonX can reveal novel explanatory information, particularly for misclassifications.
It achieves lower distortion with fewer coefficients than other state-of-the-art methods.
By requiring sparsity in the wavelet domain, the method extracts the relevant piece-wise smooth part of an image instead of relevant pixel-sparse regions.
Abstract
We present CartoonX (Cartoon Explanation), a novel model-agnostic explanation method tailored towards image classifiers and based on the rate-distortion explanation (RDE) framework. Natural images are roughly piece-wise smooth signals -- also called cartoon-like images -- and tend to be sparse in the wavelet domain. CartoonX is the first explanation method to exploit this by requiring its explanations to be sparse in the wavelet domain, thus extracting the relevant piece-wise smooth part of an image instead of relevant pixel-sparse regions. We demonstrate that CartoonX can reveal novel valuable explanatory information, particularly for misclassifications. Moreover, we show that CartoonX achieves a lower distortion with fewer coefficients than other state-of-the-art methods.
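The claim that piece-wise smooth images tend to be sparse in the wavelet domain can be illustrated with a small experiment. The sketch below is not the authors' code: it assumes a hand-written one-level orthonormal 2D Haar transform (the helper names `haar2d_level1` and `detail_sparsity` are hypothetical) and compares a synthetic cartoon-like image, a filled disc, against Gaussian noise.

```python
import numpy as np

def haar2d_level1(img):
    """One level of the orthonormal 2D Haar transform (even height and width assumed)."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 2.0
    horiz = (a - b + c - d) / 2.0
    vert = (a + b - c - d) / 2.0
    diag = (a - b - c + d) / 2.0
    return approx, (horiz, vert, diag)

def detail_sparsity(img, tol=1e-12):
    """Fraction of detail coefficients whose magnitude exceeds tol."""
    _, details = haar2d_level1(img)
    coeffs = np.concatenate([band.ravel() for band in details])
    return float(np.mean(np.abs(coeffs) > tol))

# Cartoon-like image: a filled disc on a flat background (piece-wise constant).
n = 128
yy, xx = np.mgrid[0:n, 0:n]
cartoon = ((xx - 63.5) ** 2 + (yy - 63.5) ** 2 < 40 ** 2).astype(float)

# Unstructured comparison image: i.i.d. Gaussian noise.
rng = np.random.default_rng(0)
noise = rng.standard_normal((n, n))

cartoon_frac = detail_sparsity(cartoon)
noise_frac = detail_sparsity(noise)
print(f"nonzero detail coefficients, cartoon: {cartoon_frac:.3f}")
print(f"nonzero detail coefficients, noise:   {noise_frac:.3f}")
```

In the disc image only the 2x2 blocks that straddle the edge produce nonzero detail coefficients, so a small set of coefficients describes the whole shape, whereas essentially every detail coefficient of the noise image is nonzero. A practical implementation would more likely use a multi-level transform from a wavelet library; the single Haar level here is only meant to make the sparsity effect visible.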
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis
