FRInGe: Distribution-Space Integrated Gradients with Fisher-Rao Geometry
Gabriele Martino, Sebastian Tschiatschek

TL;DR
FRInGe is an Integrated Gradients variant that defines both its reference and its interpolation path in predictive distribution space using Fisher-Rao geometry, improving calibration-oriented attribution metrics across six ImageNet architectures.
Contribution
FRInGe replaces heuristic input baselines and straight-line paths with a maximum-entropy predictive reference and a Fisher-Rao geodesic on the probability simplex, aiming for more reliable explanations.
Findings
FRInGe improves calibration-oriented attribution metrics, especially MAS scores, across six ImageNet architectures.
It remains competitive on perturbation AUC and infidelity metrics.
The input-space trajectory that realizes the geodesic is stabilized with KL and Euclidean trust regions.
Abstract
Gradient-based attribution methods are model-faithful and scalable, but Integrated Gradients (IG) can be brittle because explanations depend on heuristic baselines, straight-line paths, discretization, and saturation. We propose Fisher-Rao Integrated Gradients (FRInGe), which defines both the reference and interpolation schedule in predictive distribution space. FRInGe replaces input baselines with a maximum-entropy predictive reference and follows a Fisher-Rao geodesic on the probability simplex. The corresponding input-space trajectory is realized through the pullback Fisher metric and stabilized by KL and Euclidean trust regions; attributions are obtained by integrating input gradients along this trajectory. Across six ImageNet architectures, FRInGe most clearly improves calibration-oriented attribution metrics, especially MAS scores, while remaining competitive on perturbation AUC…
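To make the phrase "Fisher-Rao geodesic on the probability simplex" concrete, the sketch below uses the standard closed form: the square-root map sends the simplex onto the positive part of the unit sphere, where geodesics are great-circle arcs. It is an illustration only, not the authors' implementation. It covers the distribution-space path alone and omits the pullback to input space, the trust regions, and the gradient integration. The function names, the four-class target distribution, the five-step schedule, and the choice to run the path from the uniform (maximum-entropy) distribution toward the prediction are all assumptions made for the example.

```python
import numpy as np

def fisher_rao_geodesic(p, q, t):
    """Point at fraction t in [0, 1] on the Fisher-Rao geodesic from p to q.

    p and q are probability vectors over the same classes.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    a, b = np.sqrt(p), np.sqrt(q)  # square-root map onto the unit sphere
    theta = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))  # angle between a and b
    if theta < 1e-12:  # endpoints coincide, nothing to interpolate
        return p.copy()
    # Spherical linear interpolation along the great-circle arc.
    s = (np.sin((1.0 - t) * theta) * a + np.sin(t * theta) * b) / np.sin(theta)
    return s ** 2  # squaring maps the point back to the simplex

def fisher_rao_distance(p, q):
    """Fisher-Rao distance: twice the angle between sqrt(p) and sqrt(q)."""
    bc = np.sum(np.sqrt(np.asarray(p, dtype=float) * np.asarray(q, dtype=float)))
    return 2.0 * np.arccos(np.clip(bc, -1.0, 1.0))

# Hypothetical example: uniform reference and a made-up 4-class prediction.
reference = np.full(4, 0.25)                 # maximum-entropy distribution
target = np.array([0.7, 0.2, 0.05, 0.05])    # assumed model output
schedule = np.linspace(0.0, 1.0, 5)          # assumed interpolation steps
path = np.stack([fisher_rao_geodesic(reference, target, t) for t in schedule])
```

Each row of `path` is a valid probability vector, and equal steps in `t` cover equal Fisher-Rao distance, which a straight line in probability or logit space does not guarantee.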
