A Rigorous Study of Integrated Gradients Method and Extensions to Internal Neuron Attributions
Daniel Lundstrom, Tianjian Huang, Meisam Razaviyayn

TL;DR
This paper critically examines the Integrated Gradients attribution method, clarifies its theoretical foundations, extends its applicability to internal neuron attributions, and introduces a new efficient method for identifying influential neurons.
Contribution
It clarifies the theoretical basis of IG, addresses its sensitivity properties, extends axioms to distribution-based baselines, and proposes a novel efficient neuron attribution method.
Findings
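Identified key differences between the function spaces IG is applied to and those assumed in the supporting literature, which undermine earlier claims of IG's uniqueness; adding a non-decreasing positivity axiom restores those claims.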
Identified function classes for which IG is, and is not, Lipschitz continuous.
Developed a new efficient method for internal neuron attribution.
Abstract
As deep learning (DL) efficacy grows, so do concerns about poor model explainability. Attribution methods address explainability by quantifying the importance of an input feature for a model prediction. Among various methods, Integrated Gradients (IG) sets itself apart by claiming that other methods fail to satisfy desirable axioms, while IG and methods like it uniquely satisfy them. This paper comments on fundamental aspects of IG and its applications/extensions: 1) We identify key differences between IG function spaces and the supporting literature's function spaces which problematize previous claims of IG uniqueness. We show that with the introduction of an additional axiom, non-decreasing positivity, the uniqueness claims can be established. 2) We address the question of input sensitivity by identifying function classes where IG is/is not Lipschitz in the…
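For readers unfamiliar with the method under study, the following is a minimal sketch of IG as it is commonly defined in the original IG literature: the attribution to feature i is (x_i - x'_i) times the average of the partial derivative of the model along the straight line from a baseline x' to the input x. The formula is not stated on this page, and the toy function `f`, its hand-written gradient `grad_f`, the zero baseline, and the step count are all hypothetical choices made for illustration. The integral is approximated with a midpoint Riemann sum.

```python
def integrated_gradients(grad_f, x, baseline, steps=200):
    """Approximate IG with a midpoint Riemann sum along the straight-line path."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th subinterval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(point)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    # Scale the averaged gradient by the displacement from the baseline.
    return [(xi - b) * gi for xi, b, gi in zip(x, baseline, avg_grad)]


# Hypothetical toy model standing in for a network output.
def f(x):
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


x = [1.0, 2.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)

# Completeness: attributions should sum to f(x) - f(baseline).
completeness_gap = sum(attributions) - (f(x) - f(baseline))
```

In this toy case the attributions sum to f(x) - f(baseline), which is the completeness axiom that IG is usually described as satisfying. A real model would supply `grad_f` through automatic differentiation instead of a hand-written gradient.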
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Cell Image Analysis Techniques
