Manifold-Aligned Guided Integrated Gradients for Reliable Feature Attribution

Soyeon Kim; Seongwoo Lim; Kyowoon Lee; Jaesik Choi

arXiv:2605.02167·cs.LG·May 19, 2026

TL;DR

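The paper introduces MA-GIG (Manifold-Aligned Guided Integrated Gradients), a path-based feature attribution method that builds its attribution paths in the latent space of a pre-trained variational autoencoder, with the aim of producing more reliable and faithful explanations for deep neural networks.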

Contribution

The paper proposes a manifold-aligned variant of Guided Integrated Gradients that uses a pre-trained variational autoencoder to bias attribution paths toward the learned data manifold.

Findings

01. MA-GIG reduces off-manifold noise in attributions.

02. MA-GIG outperforms prior path-based attribution methods.

03. MA-GIG produces more faithful explanations across datasets.

Abstract

Feature attribution is central to diagnosing and trusting deep neural networks, and Integrated Gradients (IG) is widely used due to its axiomatic properties. However, IG can yield unreliable explanations when the integration path between a baseline and the input passes through regions with noisy gradients. While Guided Integrated Gradients reduces this sensitivity by adaptively updating low-gradient-magnitude features, input-space guidance still produces intermediate inputs that deviate from the data manifold. To address this limitation, we propose Manifold-Aligned Guided Integrated Gradients (MA-GIG), which constructs attribution paths in the latent space of a pre-trained variational autoencoder. By decoding intermediate latent states, MA-GIG biases the path toward the learned generative manifold and reduces exposure to implausible input-space regions. Through qualitative and…
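
To make the contrast in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It compares a standard straight-line Integrated Gradients path in input space with a path obtained by interpolating in a latent space and decoding each intermediate point. Everything in it is an assumption made for illustration: the quadratic toy model, the `tanh` decoder standing in for a pre-trained VAE decoder, and all function names. The adaptive "guided" feature-selection step is omitted because the abstract does not specify it.

```python
import numpy as np

def model(x, w, A):
    """Toy scalar model f(x) = w.x + 0.5 * x^T A x, with A symmetric (hypothetical)."""
    return float(w @ x + 0.5 * x @ A @ x)

def model_grad(x, w, A):
    """Analytic gradient of the toy model."""
    return w + A @ x

def path_attributions(points, grad_fn):
    """Path-integrated gradients along a polyline, using the midpoint rule per segment."""
    attr = np.zeros_like(points[0])
    for a, b in zip(points[:-1], points[1:]):
        attr += grad_fn(0.5 * (a + b)) * (b - a)
    return attr

def straight_path(x_base, x_in, steps):
    """Standard IG path: linear interpolation in input space."""
    alphas = np.linspace(0.0, 1.0, steps + 1)
    return [x_base + t * (x_in - x_base) for t in alphas]

def decoded_latent_path(z_base, z_in, decode, steps):
    """Interpolate in latent space, then decode every intermediate state."""
    alphas = np.linspace(0.0, 1.0, steps + 1)
    return [decode(z_base + t * (z_in - z_base)) for t in alphas]

rng = np.random.default_rng(0)
d, m = 6, 2                                  # input and latent dimensions
w = rng.normal(size=d)
M = rng.normal(size=(d, d))
A = 0.5 * (M + M.T)                          # symmetrize so the gradient is w + A x
W_dec = rng.normal(size=(d, m))
decode = lambda z: np.tanh(W_dec @ z)        # stand-in for a VAE decoder (assumption)

z_base, z_in = np.zeros(m), rng.normal(size=m)
x_base, x_in = decode(z_base), decode(z_in)  # both endpoints lie in the decoder's range

grad_fn = lambda x: model_grad(x, w, A)
ig_attr = path_attributions(straight_path(x_base, x_in, 50), grad_fn)
latent_attr = path_attributions(decoded_latent_path(z_base, z_in, decode, 50), grad_fn)
delta = model(x_in, w, A) - model(x_base, w, A)

print("straight-line attributions:", np.round(ig_attr, 3))
print("latent-path attributions:  ", np.round(latent_attr, 3))
print("completeness gap (straight, latent):",
      abs(ig_attr.sum() - delta), abs(latent_attr.sum() - delta))
```

Both paths share the same endpoints, so both attribution vectors sum to the same change in model output; they differ only in how that change is split across features. In this sketch every intermediate point on the latent path is a decoder output, which is the sense in which such a path stays closer to the learned manifold than a straight line through input space.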

Code & Models

Repositories

leekwoon/ma-gig (GitHub)
