Four Axiomatic Characterizations of the Integrated Gradients Attribution Method

Daniel Lundstrom; Meisam Razaviyayn

arXiv:2306.13753 · cs.LG · June 27, 2023 · 2 cites

TL;DR

This paper gives four axiomatic characterizations of the Integrated Gradients (IG) attribution method. Each one shows that, within a class of attribution methods, IG is the only method satisfying a particular set of axioms.

Contribution

The paper introduces four distinct axiomatic characterizations of Integrated Gradients, each built on a different set of axioms.

Findings

01

Within a class of attribution methods, IG is the unique method satisfying each of the four sets of axioms

02

Four different axiomatic characterizations of IG are established

03

The characterizations give theoretical insight into which axioms single out IG among attribution methods

Abstract

Deep neural networks have produced significant progress among machine learning models in terms of accuracy and functionality, but their inner workings are still largely unknown. Attribution methods seek to shine a light on these "black box" models by indicating how much each input contributed to a model's outputs. The Integrated Gradients (IG) method is a state-of-the-art baseline attribution method in the axiomatic vein, meaning it is designed to conform to particular principles of attributions. We present four axiomatic characterizations of IG, establishing IG as the unique method to satisfy different sets of axioms among a class of attribution methods.
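For readers who have not seen IG written out, the following is a minimal sketch of the standard definition from the IG literature; the formula is not stated on this page. For a model F, input x, and baseline x', the attribution to input i is (x_i - x'_i) times the average of the partial derivative of F with respect to input i along the straight line from x' to x. The toy model, its hand-written gradient, and every function name below are hypothetical, chosen only for illustration, and the integral is approximated with a midpoint Riemann sum.

```python
from typing import Callable, List, Sequence


def integrated_gradients(
    grad_f: Callable[[Sequence[float]], List[float]],
    x: Sequence[float],
    baseline: Sequence[float],
    steps: int = 200,
) -> List[float]:
    """Approximate IG with a midpoint Riemann sum over the straight-line path."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th subinterval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        g = grad_f(point)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    # Scale the path-averaged gradient by the input's displacement from the baseline.
    return [(xi - b) * gi for xi, b, gi in zip(x, baseline, avg_grad)]


# Hypothetical toy model (an assumption for illustration): F(x) = x0 * x1 + x2 ** 2
def toy_model(x: Sequence[float]) -> float:
    return x[0] * x[1] + x[2] ** 2


# Its gradient, written by hand so the sketch needs no autodiff library.
def toy_grad(x: Sequence[float]) -> List[float]:
    return [x[1], x[0], 2.0 * x[2]]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(toy_grad, x, baseline)

# Completeness check: attributions should sum to F(x) - F(baseline).
total = sum(attributions)
gap = toy_model(x) - toy_model(baseline)
```

In this toy case the attributions come out close to 1, 1, and 9, and their sum matches F(x) - F(x') = 11 up to floating-point error. That sum property is usually called completeness and is one of the principles commonly associated with IG; which axioms the paper's four characterizations actually rely on is not listed on this page.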

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning