Constructing sensible baselines for Integrated Gradients

Jai Bardhan; Cyrin Neeraj; Mihir Rawat; Subhadip Mitra

arXiv:2412.13864 · cs.LG · December 19, 2024

Open Access

TL;DR

This paper shows how better-designed baselines make Integrated Gradients (IG) more useful for interpreting black-box machine learning models, using a particle physics case study as the example.

Contribution

It proposes replacing the zero-vector baseline in IG with an averaged baseline sampled from background events, which gives more meaningful feature attributions.
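To make the contrast between the two baselines concrete, here is a minimal sketch rather than the authors' implementation. It assumes a toy logistic-regression classifier with hand-picked weights, synthetic "background" events drawn from a Gaussian, and one possible reading of "averaged baseline": the feature-wise mean of the sampled background events. (Another reading, averaging the attributions obtained from many individually sampled baselines, is equally plausible from the summary alone.) All names (`model`, `model_grad`, `integrated_gradients`, `background`) are hypothetical.

```python
import numpy as np

def model(x, w, b):
    """Toy stand-in for a classifier: logistic-regression score in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def model_grad(path, w, b):
    """Analytic gradient of the toy score w.r.t. the inputs, for a (steps, d) array."""
    p = model(path, w, b)                      # shape (steps,)
    return (p * (1.0 - p))[:, None] * w        # shape (steps, d)

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate IG with a midpoint Riemann sum along the straight path
    from `baseline` to `x`."""
    alphas = (np.arange(steps) + 0.5) / steps              # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)     # shape (steps, d)
    avg_grad = model_grad(path, w, b).mean(axis=0)         # path-averaged gradient
    return (x - baseline) * avg_grad

# Assumed toy setup: 3 features, synthetic background sample, fixed weights.
rng = np.random.default_rng(0)
background = rng.normal(loc=[1.0, -0.5, 2.0], scale=0.5, size=(1000, 3))
w = np.array([1.5, -2.0, 0.5])
b = -0.3
x = np.array([2.0, 0.5, 1.0])                  # the event to explain

zero_baseline = np.zeros_like(x)               # conventional zero-vector baseline
mean_baseline = background.mean(axis=0)        # assumed "averaged background" baseline

attr_zero = integrated_gradients(x, zero_baseline, w, b)
attr_mean = integrated_gradients(x, mean_baseline, w, b)

print("zero baseline attributions:", attr_zero)
print("mean-background attributions:", attr_mean)
```

With the averaged baseline, each attribution measures how a feature moves the score relative to a typical background event, whereas the zero vector may not correspond to any physically meaningful input.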

Findings

01. The zero-vector baseline does not provide good feature attributions.

02. An averaged baseline sampled from background events yields consistently more reasonable attributions.

03. The approach improves the interpretability of ML models in scientific applications.

Abstract

Machine learning methods have seen a meteoric rise in their applications in the scientific community. However, little effort has been put into understanding these "black box" models. We show how one can apply integrated gradients (IGs) to understand these models by designing different baselines, by taking an example case study in particle physics. We find that the zero-vector baseline does not provide good feature attributions and that an averaged baseline sampled from the background events provides consistently more reasonable attributions.
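For reference, the standard definition of integrated gradients shows where the baseline enters. The notation here is an assumption and not taken from the paper: $F$ is the model output, $x$ the input being explained, $x'$ the baseline, and $b^{(k)}$ are $N$ events sampled from the background. The last line is only one possible formalization of an "averaged baseline".

```latex
% Attribution of feature i along the straight path from baseline x' to input x
\mathrm{IG}_i(x; x') = (x_i - x'_i)
  \int_0^1 \frac{\partial F\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i}\, d\alpha

% Completeness: attributions sum to the change in model output
\sum_i \mathrm{IG}_i(x; x') = F(x) - F(x')

% Zero-vector baseline versus an (assumed) feature-wise mean of background events
x'_{\text{zero}} = \mathbf{0}, \qquad
x'_{\text{avg}} = \frac{1}{N} \sum_{k=1}^{N} b^{(k)}
```

Because of completeness, every attribution is measured relative to $F(x')$, so a baseline that lies far from any realistic event changes what the attributions mean.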

Taxonomy

Topics: Higher Education Learning Practices · Intelligent Tutoring Systems and Adaptive Learning