Explainers in the Wild: Making Surrogate Explainers Robust to Distortions through Perception
Alexander Hepburn, Raul Santos-Rodriguez

TL;DR
This paper introduces a method that makes surrogate explainers for image classification more robust by incorporating perceptual distances, so that explanations stay more consistent when real-world images are distorted.
Contribution
The authors propose a novel approach that embeds perceptual distances into surrogate explainers to enhance their robustness against image distortions in real-world data.
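To make the idea concrete, here is a minimal sketch of a LIME-style surrogate explainer in which the distance used to weight perturbed samples is a pluggable function. Everything in it is an assumption made for illustration: the grid "superpixels", the names `grid_segments`, `toy_perceptual_distance` and `explain`, the exponential kernel, and the ridge fit are not taken from the paper, and the toy distance is only a stand-in for a real perceptual metric.

```python
import numpy as np

def grid_segments(h, w, cells=4):
    """Label each pixel with its grid-cell index (a crude stand-in for superpixels)."""
    rows = np.arange(h) * cells // h
    cols = np.arange(w) * cells // w
    return rows[:, None] * cells + cols[None, :]

def toy_perceptual_distance(a, b):
    """Hypothetical stand-in for a perceptual distance: RMSE between
    globally standardised images. NOT the metric used in the paper."""
    def standardise(x):
        return (x - x.mean()) / (x.std() + 1e-8)
    return float(np.sqrt(np.mean((standardise(a) - standardise(b)) ** 2)))

def explain(image, predict, distance, n_samples=200, kernel_width=1.0,
            cells=4, seed=0):
    """Fit a weighted linear surrogate; return one importance score per segment."""
    rng = np.random.default_rng(seed)
    segments = grid_segments(*image.shape, cells=cells)
    n_segments = cells * cells

    # Each row switches segments on (1) or off (0); row 0 is the unperturbed image.
    masks = rng.integers(0, 2, size=(n_samples, n_segments))
    masks[0] = 1

    preds = np.empty(n_samples)
    dists = np.empty(n_samples)
    for i, mask in enumerate(masks):
        perturbed = image * mask[segments]
        preds[i] = predict(perturbed)
        dists[i] = distance(image, perturbed)  # the pluggable distance

    # Samples that are closer to the original get larger weights.
    weights = np.exp(-(dists ** 2) / kernel_width ** 2)

    # Weighted ridge regression on the binary masks plus an intercept column.
    X = np.hstack([masks, np.ones((n_samples, 1))])
    A = X.T @ (weights[:, None] * X) + 1e-3 * np.eye(X.shape[1])
    coef = np.linalg.solve(A, X.T @ (weights * preds))
    return coef[:-1]  # drop the intercept

# Toy usage: a "black box" that only looks at the top-left grid cell (segment 0).
rng = np.random.default_rng(1)
image = rng.random((16, 16)) + 0.5
black_box = lambda x: float(x[:4, :4].mean())
importance = explain(image, black_box, toy_perceptual_distance)
print(np.argmax(np.abs(importance)))
```

Because `distance` is an argument, swapping the toy function for a real perceptual metric, or for plain Euclidean distance as a baseline, changes only the sample weights and leaves the rest of the surrogate fit untouched.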
Findings
Perceptual distances improve explanation coherence for distorted images.
Surrogate explainers become more reliable with the proposed methodology.
The method was evaluated on the ImageNet-C dataset, with positive results.
Abstract
Explaining the decisions of models is becoming pervasive in the image processing domain, whether it is by using post-hoc methods or by creating inherently interpretable models. While the widespread use of surrogate explainers is a welcome addition to inspect and understand black-box models, assessing the robustness and reliability of the explanations is key for their success. Additionally, whilst existing work in the explainability field proposes various strategies to address this problem, the challenges of working with data in the wild are often overlooked. For instance, in image classification, distortions to images can affect not only the predictions assigned by the model, but also the explanation. Given a clean and a distorted version of an image, even if the prediction probabilities are similar, the explanation may still be different. In this paper we propose a methodology to…
