Four Axiomatic Characterizations of the Integrated Gradients Attribution Method
Daniel Lundstrom, Meisam Razaviyayn

TL;DR
This paper provides four axiomatic characterizations of the Integrated Gradients (IG) method, each establishing IG as the unique method satisfying a particular set of axioms within a class of attribution methods.
Contribution
The paper introduces four distinct axiomatic characterizations of Integrated Gradients, each identifying a set of axioms that IG alone satisfies among a class of attribution methods.
Findings
Within a class of attribution methods, IG is the unique method satisfying each of the four sets of axioms
Four different axiomatic characterizations of IG are established
Theoretical insights into the properties of attribution methods
Abstract
Deep neural networks have produced significant progress among machine learning models in terms of accuracy and functionality, but their inner workings are still largely unknown. Attribution methods seek to shine a light on these "black box" models by indicating how much each input contributed to a model's outputs. The Integrated Gradients (IG) method is a state-of-the-art baseline attribution method in the axiomatic vein, meaning it is designed to conform to particular principles of attributions. We present four axiomatic characterizations of IG, establishing IG as the unique method to satisfy different sets of axioms among a class of attribution methods.
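For readers unfamiliar with IG, the following is a minimal sketch of the standard definition: the attribution for input i is (x_i - x'_i) times the integral, along the straight line from a baseline x' to the input x, of the partial derivative of the model with respect to x_i. The integral is approximated here with a midpoint Riemann sum. The names toy_model, toy_model_grad, and integrated_gradients are hypothetical and chosen for illustration; they do not come from the paper. The final check illustrates completeness (attributions sum to the change in model output), a property commonly associated with IG; the abstract does not state which axioms the four characterizations use.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_f: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 200,
) -> List[float]:
    """Approximate IG attributions with a midpoint Riemann sum.

    grad_f returns the gradient of the model at a given point.
    """
    diff = [xi - bi for xi, bi in zip(x, baseline)]
    grad_sums = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [bi + alpha * di for bi, di in zip(baseline, diff)]
        for i, g in enumerate(grad_f(point)):
            grad_sums[i] += g
    # Average gradient along the path, scaled by the input-baseline difference.
    return [di * s / steps for di, s in zip(diff, grad_sums)]


# Hypothetical toy model (an assumption for illustration, not from the paper).
def toy_model(x: Sequence[float]) -> float:
    return x[0] * x[1] + x[2] ** 2


def toy_model_grad(x: Sequence[float]) -> List[float]:
    return [x[1], x[0], 2.0 * x[2]]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_model_grad, x, baseline)

# Completeness: attributions should sum to F(x) - F(baseline).
output_change = toy_model(x) - toy_model(baseline)
print(attributions, sum(attributions), output_change)
```

With the zero baseline used here, the attributions come out at approximately 1, 1, and 9, which sum to the output change of 11. For a real network the gradient would come from automatic differentiation rather than a hand-written function, and the number of steps needed for a good approximation depends on the model.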
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
