Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI

Won Jun Kim; Hyungjin Chung; Jaemin Kim; Sangmin Lee; Byeongsu Sim; Jong Chul Ye

arXiv:2411.15265·cs.CV·July 22, 2025

Open Access · 1 Repo

TL;DR

This paper introduces FreeMCG, a derivative-free, manifold-constrained gradient approximation method that improves explainability of neural networks by requiring only output access and producing more faithful, human-aligned attributions.

Contribution

The paper presents FreeMCG, a novel derivative-free approach that combines ensemble Kalman filters with diffusion models to approximate a model's gradient projected onto the data manifold, using only the model's outputs.

Findings

01. Achieves state-of-the-art results in counterfactual generation.

02. Provides more faithful and human-aligned feature attributions.

03. Requires only model output access, enhancing black-box applicability.

Abstract

Gradient-based methods are a prototypical family of explainability techniques, especially for image-based models. Nonetheless, they have several shortcomings: they (1) require white-box access to models, (2) are vulnerable to adversarial attacks, and (3) produce attributions that lie off the image manifold, leading to explanations that are not actually faithful to the model and do not align well with human perception. To overcome these challenges, we introduce Derivative-Free Diffusion Manifold-Constrained Gradients (FreeMCG), a novel method that serves as a better basis than the traditional gradient for explaining a given neural network. Specifically, by leveraging ensemble Kalman filters and diffusion models, we derive a derivative-free approximation of the model's gradient projected onto the data manifold, requiring access only to the model's outputs. We demonstrate…
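The abstract's "derivative-free approximation of the model's gradient" builds on a general property of ensemble Kalman methods: the cross-covariance between a set of particles and their model outputs approximates the particle covariance times the gradient, to first order. The sketch below illustrates only that generic identity with NumPy. It is not the authors' FreeMCG algorithm and omits the diffusion model entirely; the function name `ensemble_gradient_estimate` and the toy linear model are assumptions made for illustration.

```python
import numpy as np


def ensemble_gradient_estimate(f, particles):
    """Derivative-free estimate of C_xx @ grad f from an ensemble.

    Hypothetical helper, not taken from the paper.
    particles: array of shape (J, d), one particle per row.
    f: black-box function mapping a (d,) array to a scalar output.
    Returns the cross-covariance between particles and outputs, shape (d,).
    """
    # Query the black box once per particle; no derivatives are needed.
    values = np.array([f(x) for x in particles])
    # Center particles and outputs around their ensemble means.
    dx = particles - particles.mean(axis=0)
    dy = values - values.mean()
    # Cross-covariance C_xy. For smooth f and a tight ensemble,
    # C_xy is approximately C_xx @ grad f (exact when f is affine).
    return dx.T @ dy / len(particles)


# Toy usage: an affine "model" whose true gradient is `a`.
rng = np.random.default_rng(0)
a = np.array([1.0, -2.0, 0.5])
particles = rng.normal(size=(50, 3))
estimate = ensemble_gradient_estimate(lambda x: a @ x + 3.0, particles)
```

The estimate is preconditioned by the ensemble covariance, so its direction follows the spread of the particles. This is the sense in which an ensemble drawn near the data manifold can yield a manifold-aware gradient surrogate; how FreeMCG constructs such an ensemble with a diffusion model is described in the paper and is not reproduced here.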

Code & Models

Repositories

openxaiproject/pnpxai · pytorch

Taxonomy

Topics: Advanced Numerical Analysis Techniques · Numerical methods in inverse problems · Model Reduction and Neural Networks

Methods: Diffusion · ALIGN