NoiseGrad: Enhancing Explanations by Introducing Stochasticity to Model Weights
Kirill Bykov, Anna Hedström, Shinichi Nakajima, Marina M.-C. Höhne

TL;DR
NoiseGrad introduces stochasticity into neural network weights to improve local and global explanations, outperforming existing methods like SmoothGrad by perturbing decision boundaries rather than just input data.
Contribution
The paper proposes NoiseGrad, a method that enhances explanation quality by adding stochasticity to the model weights. It extends SmoothGrad's idea of perturbing the input with noise and averaging the resulting explanations.
Findings
NoiseGrad outperforms baseline explanation methods such as SmoothGrad in the paper's evaluation.
FusionGrad combines NoiseGrad and SmoothGrad for further improvement.
Both methods are model-agnostic and easy to use without fine-tuning.
Abstract
Many efforts have been made for revealing the decision-making process of black-box learning machines such as deep neural networks, resulting in useful local and global explanation methods. For local explanation, stochasticity is known to help: a simple method, called SmoothGrad, has improved the visual quality of gradient-based attribution by adding noise to the input space and averaging the explanations of the noisy inputs. In this paper, we extend this idea and propose NoiseGrad that enhances both local and global explanation methods. Specifically, NoiseGrad introduces stochasticity in the weight parameter space, such that the decision boundary is perturbed. NoiseGrad is expected to enhance the local explanation, similarly to SmoothGrad, due to the dual relationship between the input perturbation and the decision boundary perturbation. We evaluate NoiseGrad and its fusion with…
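The averaging scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy model f(x) = w2 · tanh(W1 x), the function names, the sample counts, and the noise levels are all hypothetical choices made for illustration. The sketch also assumes that weight noise is multiplicative Gaussian noise centred at 1 and that FusionGrad averages over both noise sources; the abstract only states that stochasticity is introduced in the weight parameter space.

```python
import numpy as np


def input_gradient(W1, w2, x):
    """Gradient of the toy model f(x) = w2 . tanh(W1 @ x) with respect to x."""
    h = np.tanh(W1 @ x)
    return W1.T @ (w2 * (1.0 - h ** 2))


def smoothgrad(W1, w2, x, n=50, sigma_x=0.1, rng=None):
    """Average gradients over n noisy copies of the input (weights fixed)."""
    rng = np.random.default_rng() if rng is None else rng
    grads = [
        input_gradient(W1, w2, x + rng.normal(0.0, sigma_x, x.shape))
        for _ in range(n)
    ]
    return np.mean(grads, axis=0)


def noisegrad(W1, w2, x, n=50, sigma_w=0.1, rng=None):
    """Average gradients over n noisy copies of the weights (input fixed)."""
    rng = np.random.default_rng() if rng is None else rng
    grads = []
    for _ in range(n):
        # Assumption: multiplicative Gaussian noise with mean 1.
        W1_noisy = W1 * rng.normal(1.0, sigma_w, W1.shape)
        w2_noisy = w2 * rng.normal(1.0, sigma_w, w2.shape)
        grads.append(input_gradient(W1_noisy, w2_noisy, x))
    return np.mean(grads, axis=0)


def fusiongrad(W1, w2, x, n=20, m=10, sigma_w=0.1, sigma_x=0.1, rng=None):
    """Assumed fusion: n noisy weight samples, each with m noisy inputs."""
    rng = np.random.default_rng() if rng is None else rng
    grads = []
    for _ in range(n):
        W1_noisy = W1 * rng.normal(1.0, sigma_w, W1.shape)
        w2_noisy = w2 * rng.normal(1.0, sigma_w, w2.shape)
        grads.append(smoothgrad(W1_noisy, w2_noisy, x, n=m,
                                sigma_x=sigma_x, rng=rng))
    return np.mean(grads, axis=0)


# Toy usage with random weights and a random input (all values hypothetical).
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
w2 = rng.normal(size=4)
x = rng.normal(size=3)

print("plain gradient:", input_gradient(W1, w2, x))
print("SmoothGrad:    ", smoothgrad(W1, w2, x, rng=rng))
print("NoiseGrad:     ", noisegrad(W1, w2, x, rng=rng))
print("FusionGrad:    ", fusiongrad(W1, w2, x, rng=rng))
```

In this sketch, SmoothGrad and NoiseGrad differ only in where the noise enters: the first perturbs `x` and keeps the weights fixed, the second perturbs the weights (and hence the decision boundary) and keeps `x` fixed. With both noise levels set to zero, every variant reduces to the plain input gradient.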
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
