Constructing sensible baselines for Integrated Gradients
Jai Bardhan, Cyrin Neeraj, Mihir Rawat, Subhadip Mitra

TL;DR
This paper explores how to improve the interpretability of black box machine learning models using Integrated Gradients by designing better baselines, demonstrated through a particle physics case study.
Contribution
It proposes replacing the zero-vector baseline in Integrated Gradients (IG) with a baseline averaged over background events, which gives more meaningful feature attributions.
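For reference, the standard definition of Integrated Gradients for a model F, an input x, and a baseline x' is written out below, next to the two baseline choices being compared. The averaged-baseline expression is only one plausible reading of the summary: the symbols b_j (background events) and N (their number) are notation assumed here, not taken from the paper.

```latex
\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1
  \frac{\partial F\big(x' + \alpha\,(x - x')\big)}{\partial x_i}\, d\alpha,
\qquad
x'_{\text{zero}} = \mathbf{0},
\qquad
x'_{\text{avg}} = \frac{1}{N} \sum_{j=1}^{N} b_j
```

Because each attribution carries the factor (x_i - x'_i), any feature whose value equals the baseline value receives zero attribution, which is one reason the choice of baseline matters.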
Findings
The zero-vector baseline does not provide good feature attributions.
An averaged baseline sampled from background events yields consistently more reasonable attributions.
Designing the baseline carefully makes IG more useful for interpreting ML models in scientific applications.
Abstract
Machine learning methods have seen a meteoric rise in their applications in the scientific community. However, little effort has been put into understanding these "black box" models. We show how one can apply integrated gradients (IGs) to understand these models by designing different baselines, by taking an example case study in particle physics. We find that the zero-vector baseline does not provide good feature attributions and that an averaged baseline sampled from the background events provides consistently more reasonable attributions.
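The comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy logistic "classifier", its weights `w` and `b`, the synthetic `background` sample, and the helper names are hypothetical and are not the paper's model or data. The averaged baseline is taken to be the feature-wise mean of the background events; averaging attributions over many sampled background baselines would be another reasonable reading.

```python
import numpy as np

def model(x, w, b):
    """Toy differentiable classifier (assumed): sigmoid of a linear score."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to the input x."""
    p = np.asarray(model(x, w, b))
    return (p * (1.0 - p))[..., None] * w

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate IG with a midpoint Riemann sum along the straight path."""
    alphas = (np.arange(steps) + 0.5) / steps           # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)  # shape (steps, n_features)
    avg_grad = model_grad(path, w, b).mean(axis=0)      # path-averaged gradient
    return (x - baseline) * avg_grad

# Hypothetical setup: three features, synthetic "background" events.
rng = np.random.default_rng(0)
w = np.array([1.5, -2.0, 0.5])
b = 0.1
background = rng.normal(loc=[2.0, 1.0, -1.0], scale=0.5, size=(1000, 3))
x = np.array([3.0, 0.0, -1.0])  # event to explain; second feature is exactly 0

zero_baseline = np.zeros(3)
mean_baseline = background.mean(axis=0)  # averaged background baseline

attr_zero = integrated_gradients(x, zero_baseline, w, b)
attr_mean = integrated_gradients(x, mean_baseline, w, b)

print("zero baseline:", attr_zero)
print("mean baseline:", attr_mean)
```

In this toy setup the second feature is exactly 0 for the event being explained, so the zero-vector baseline assigns it no attribution at all, even though the model depends on it. The averaged background baseline attributes the prediction relative to a typical background event instead. In both cases the attributions sum (up to numerical error) to the difference between the model output at the input and at the baseline.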
