On Spectral Properties of Gradient-based Explanation Methods
Amir Mehrpanah, Erik Englesson, Hossein Azizpour

TL;DR
This paper gives a formal spectral analysis of gradient-based explanation methods for deep networks. It identifies a spectral bias introduced by the gradient and proposes two remedies for inconsistent explanations.
Contribution
The paper introduces a probabilistic and spectral framework for analyzing explanation methods. The framework reveals a spectral bias and motivates two remedies: a mechanism to determine a standard perturbation scale, and SpectralLens.
Findings
Spectral bias is pervasive in gradient-based explanations.
The analysis sheds light on two design choices that were previously found by experiment: squaring the gradient and perturbing the input.
Proposed remedies improve explanation consistency and interpretability.
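As a rough illustration of why a spectral view is natural here, the standard Fourier identities for differentiation and Gaussian smoothing can be written out. This is a textbook sketch, not the paper's own derivation: the symbols $f$, $\hat f$, $\omega$, $\varepsilon$, $\sigma$, and $g_\sigma$ are introduced here for illustration, and the Fourier convention $\hat f(\omega) = \int f(x)\, e^{-i\omega^\top x}\, dx$ is an assumption.

```latex
% Differentiation multiplies the spectrum by i*omega, which amplifies high frequencies:
\mathcal{F}[\nabla f](\omega) = i\,\omega\,\hat f(\omega)

% Averaging the gradient over Gaussian input noise is a convolution with the
% Gaussian density g_sigma (covariance sigma^2 I):
\mathbb{E}_{\varepsilon \sim \mathcal{N}(0,\sigma^2 I)}\big[\nabla f(x+\varepsilon)\big]
  = (\nabla f * g_\sigma)(x)

% In the Fourier domain the convolution becomes a Gaussian low-pass factor:
\mathcal{F}\big[\nabla f * g_\sigma\big](\omega)
  = i\,\omega\,\hat f(\omega)\,\exp\!\big(-\tfrac{1}{2}\sigma^2\lVert\omega\rVert^2\big)
```

Under these assumptions, the perturbation scale $\sigma$ sets which frequency band of the model survives in the averaged gradient, which is one way to see why different choices of $\sigma$ can yield different explanations.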
Abstract
Understanding the behavior of deep networks is crucial to increase our confidence in their results. Despite an extensive body of work for explaining their predictions, researchers have faced reliability issues, which can be attributed to insufficient formalism. In our research, we adopt novel probabilistic and spectral perspectives to formally analyze explanation methods. Our study reveals a pervasive spectral bias stemming from the use of gradient, and sheds light on some common design choices that have been discovered experimentally, in particular, the use of squared gradient and input perturbation. We further characterize how the choice of perturbation hyperparameters in explanation methods, such as SmoothGrad, can lead to inconsistent explanations and introduce two remedies based on our proposed formalism: (i) a mechanism to determine a standard perturbation scale, and (ii) an…
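The sensitivity to the perturbation scale mentioned in the abstract can be illustrated with a toy example. The sketch below is not the paper's code: it assumes a hypothetical one-dimensional "model" made of two sinusoids, and the function names (`grad_f`, `smoothgrad`, `smoothgrad_closed_form`) are made up for this illustration. It computes a SmoothGrad-style average of the gradient under Gaussian input noise, with an optional squared-gradient variant, and compares the plain average against its closed form for this toy function.

```python
import numpy as np

# Hypothetical toy model: f(x) = sum_k a_k * sin(w_k * x), with one low and
# one high frequency component. These values are illustrative assumptions.
FREQS = (1.0, 8.0)
AMPS = (1.0, 0.5)


def grad_f(x):
    """Analytic derivative of the toy model: sum_k a_k * w_k * cos(w_k * x)."""
    x = np.asarray(x, dtype=float)
    return sum(a * w * np.cos(w * x) for w, a in zip(FREQS, AMPS))


def smoothgrad(x, sigma, n_samples=20000, squared=False, seed=0):
    """Monte Carlo average of the gradient (or its square) under
    Gaussian input noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=n_samples)
    g = grad_f(x + noise)
    if squared:
        g = g ** 2
    return float(g.mean())


def smoothgrad_closed_form(x, sigma):
    """Exact expectation of the plain gradient for the toy model: each
    frequency w_k is damped by exp(-0.5 * (sigma * w_k)**2)."""
    return float(sum(
        a * w * np.cos(w * x) * np.exp(-0.5 * (sigma * w) ** 2)
        for w, a in zip(FREQS, AMPS)
    ))


if __name__ == "__main__":
    x0 = 0.3
    for sigma in (0.0, 0.1, 0.5, 1.0):
        plain = smoothgrad(x0, sigma)
        sq = smoothgrad(x0, sigma, squared=True)
        exact = smoothgrad_closed_form(x0, sigma)
        print(f"sigma={sigma:.1f}  plain={plain:+.3f}  "
              f"closed_form={exact:+.3f}  squared={sq:.3f}")
```

In this toy setting, increasing `sigma` suppresses the high-frequency term first, so the same input receives noticeably different attributions at different noise scales. How the paper's proposed remedies choose or combine scales is not reproduced here.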
