On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations
Amir Mehrpanah, Matteo Gamba, Kevin Smith, Hossein Azizpour

TL;DR
This paper introduces a spectral framework to analyze the trade-off between smoothness and faithfulness in gradient-based explanations of ReLU networks, revealing how smoothing methods can distort explanations and proposing ways to quantify and regularize this effect.
Contribution
It provides a unifying spectral analysis framework for explanation smoothness and faithfulness, and quantifies the explanation gap caused by surrogate smoothing methods in ReLU networks.
Findings
Spectral analysis characterizes the smoothness-faithfulness trade-off.
Smoothing explanations can distort true model attributions.
The framework enables quantification and regularization of explanation fidelity.
Abstract
ReLU networks, while prevalent for visual data, have sharp transitions, sometimes relying on individual pixels for predictions, making vanilla gradient-based explanations noisy and difficult to interpret. Existing methods, such as GradCAM, smooth these explanations by producing surrogate models at the cost of faithfulness. We introduce a unifying spectral framework to systematically analyze and quantify smoothness, faithfulness, and their trade-off in explanations. Using this framework, we quantify and regularize the contribution of ReLU networks to high-frequency information, providing a principled approach to identifying this trade-off. Our analysis characterizes how surrogate-based smoothing distorts explanations, leading to an "explanation gap" that we formally define and measure for different post-hoc methods. Finally, we validate our theoretical findings across different design…
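The trade-off described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it assumes a random one-hidden-layer ReLU network on a scalar input, uses a Gaussian-smoothed gradient as a stand-in for a surrogate-based smoother, and uses the mean squared difference between the two gradients as a stand-in for the "explanation gap", whose formal definition is not given here. All function names and parameters (`relu_net_gradient`, `gaussian_smooth`, `high_freq_fraction`, `sigma`, `cutoff`) are hypothetical.

```python
import numpy as np

def relu_net_gradient(x, w1, b1, w2):
    """Exact df/dx of f(x) = sum_j w2[j] * relu(w1[j] * x + b1[j]) on a 1-D grid x."""
    active = ((np.outer(x, w1) + b1) > 0).astype(float)  # (n_points, n_hidden)
    return active @ (w1 * w2)                            # piecewise constant in x

def gaussian_smooth(values, dx, sigma):
    """Convolve values with a normalized Gaussian of width sigma (edge-padded)."""
    radius = int(np.ceil(4 * sigma / dx))
    offsets = np.arange(-radius, radius + 1) * dx
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(values, radius, mode="edge")
    return np.convolve(padded, kernel, mode="valid")     # same length as values

def high_freq_fraction(values, cutoff=0.1):
    """Share of spectral power above the lowest `cutoff` fraction of frequency bins."""
    # A Hann window limits leakage from treating the grid as periodic.
    centered = (values - values.mean()) * np.hanning(len(values))
    power = np.abs(np.fft.rfft(centered)) ** 2
    k = int(cutoff * len(power))
    return power[k:].sum() / power.sum()

# Assumed toy setup: 32 hidden units with standard normal weights.
rng = np.random.default_rng(0)
w1, b1, w2 = rng.normal(size=(3, 32))
x = np.linspace(-3.0, 3.0, 1024)
dx = x[1] - x[0]

grad_vanilla = relu_net_gradient(x, w1, b1, w2)             # faithful but jumpy
grad_smooth = gaussian_smooth(grad_vanilla, dx, sigma=0.3)  # smoother surrogate

# Stand-in for the gap between the faithful and the smoothed explanation.
gap = np.mean((grad_vanilla - grad_smooth) ** 2)
hf_vanilla = high_freq_fraction(grad_vanilla)
hf_smooth = high_freq_fraction(grad_smooth)

print(f"high-frequency share, vanilla gradient:  {hf_vanilla:.4f}")
print(f"high-frequency share, smoothed gradient: {hf_smooth:.4f}")
print(f"mean squared gap between the two:        {gap:.4f}")
```

In this sketch, increasing `sigma` should remove more high-frequency content from the explanation while moving it further from the exact gradient, which is the tension the abstract refers to.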
