Pixel-level Certified Explanations via Randomized Smoothing
Alaa Anani, Tobias Lorenz, Mario Fritz, Bernt Schiele

TL;DR
This paper introduces a novel certification framework using randomized smoothing to provide pixel-level robustness guarantees for any black-box attribution method, enhancing trustworthiness and interpretability of explanations in deep learning.
Contribution
It presents the first method to certify pixel-level robustness for attribution maps, reformulating the problem as segmentation and proposing new evaluation metrics.
Findings
Certified attributions are robust against $\ell_2$-bounded perturbations.
The framework applies to 12 attribution methods across 5 ImageNet models.
The approach improves interpretability and faithfulness of explanations.
Abstract
Post-hoc attribution methods aim to explain deep learning predictions by highlighting influential input pixels. However, these explanations are highly non-robust: small, imperceptible input perturbations can drastically alter the attribution map while maintaining the same prediction. This vulnerability undermines their trustworthiness and calls for rigorous robustness guarantees of pixel-level attribution scores. We introduce the first certification framework that guarantees pixel-level robustness for any black-box attribution method using randomized smoothing. By sparsifying and smoothing attribution maps, we reformulate the task as a segmentation problem and certify each pixel's importance against $\ell_2$-bounded perturbations. We further propose three evaluation metrics to assess certified robustness, localization, and faithfulness. An extensive evaluation of 12 attribution methods…
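The pipeline described in the abstract (sparsify, smooth, certify per pixel) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a standard randomized-smoothing recipe for segmentation-style certification: Gaussian noise, top-K binarization, a per-pixel majority vote, and a one-sided binomial test with a Bonferroni correction. The names `sparsify_top_k`, `certify_pixels`, `certified_radius`, the threshold `tau`, and the toy attribution function are all hypothetical and chosen for illustration only.

```python
import math
import random
from statistics import NormalDist

ABSTAIN = -1  # label for pixels whose vote is not statistically significant


def sparsify_top_k(attribution, k_fraction):
    """Binarize a flat attribution map: 1 for the top k_fraction of pixels, else 0."""
    n = len(attribution)
    k = max(1, round(k_fraction * n))
    top = sorted(range(n), key=lambda i: attribution[i], reverse=True)[:k]
    mask = [0] * n
    for i in top:
        mask[i] = 1
    return mask


def binomial_tail(count, n, p):
    """P[X >= count] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(count, n + 1))


def certify_pixels(attribute, x, sigma, n_samples, k_fraction, tau, alpha, rng):
    """Vote over noisy copies of x; return 1 (important), 0 (not important), or ABSTAIN per pixel.

    `attribute` is any black-box function mapping a flat image to flat scores.
    """
    votes = [0] * len(x)
    for _ in range(n_samples):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]  # isotropic Gaussian noise
        for i, m in enumerate(sparsify_top_k(attribute(noisy), k_fraction)):
            votes[i] += m
    labels = []
    for v in votes:
        majority = 1 if 2 * v >= n_samples else 0
        count = v if majority == 1 else n_samples - v
        # One-sided test of H0: P[majority label] <= tau, Bonferroni-corrected
        # over all pixels (an assumption of this sketch).
        p_value = binomial_tail(count, n_samples, tau)
        labels.append(majority if p_value <= alpha / len(x) else ABSTAIN)
    return labels


def certified_radius(sigma, tau):
    """L2 radius implied by the Gaussian smoothing bound when P[majority] > tau."""
    return sigma * NormalDist().inv_cdf(tau)


# Toy usage: the "attribution method" is the identity, a stand-in for a real explainer.
rng = random.Random(0)
labels = certify_pixels(lambda img: img, [10.0, 5.0, 5.0, 0.0],
                        sigma=0.1, n_samples=100, k_fraction=0.5,
                        tau=0.75, alpha=0.001, rng=rng)
print(labels, certified_radius(0.1, 0.75))
```

In the toy input, the first pixel is always in the top half and the last never is, so both receive a certified label. The two middle pixels have equal scores and swap places under noise, so the test cannot reject the null hypothesis and they are marked `ABSTAIN`. Under the assumptions above, non-abstaining labels would hold with probability at least `1 - alpha` for any perturbation whose $\ell_2$ norm is below `certified_radius(sigma, tau)`.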
Taxonomy
Topics: Cell Image Analysis Techniques · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
