GradCFA: A Hybrid Gradient-Based Counterfactual and Feature Attribution Explanation Algorithm for Local Interpretation of Neural Networks

Jacob Sanderson; Hua Mao; Wai Lok Woo

arXiv:2603.15373·cs.LG·March 17, 2026·IEEE Trans. Artif. Intell.

Open Access

TL;DR

GradCFA is a novel hybrid explanation method that combines counterfactuals and feature attribution to enhance local interpretability of neural networks, supporting multi-class scenarios and balancing feasibility, plausibility, and diversity.

Contribution

The paper introduces a hybrid framework that unifies counterfactual explanations and feature attribution, extends counterfactual generation from binary to multi-class models, and explicitly optimizes feasibility, plausibility, and diversity.

Findings

01. Effectively generates feasible and plausible counterfactuals.

02. Provides valuable feature attribution insights.

03. Outperforms state-of-the-art methods in key interpretability metrics.

Abstract

Explainable Artificial Intelligence (XAI) is increasingly essential as AI systems are deployed in critical fields such as healthcare and finance, offering transparency into AI-driven decisions. Two major XAI paradigms, counterfactual explanations (CFX) and feature attribution (FA), serve distinct roles in model interpretability. This study introduces GradCFA, a hybrid framework combining CFX and FA to improve interpretability by explicitly optimizing feasibility, plausibility, and diversity, key qualities that existing methods often fail to balance. Unlike most CFX research, which focuses on binary classification, GradCFA extends to multi-class scenarios, supporting a wider range of applications. We evaluate GradCFA's validity, proximity, sparsity, plausibility, and diversity against state-of-the-art methods, including Wachter, DiCE, and CARE for CFX, and SHAP for FA. Results show GradCFA effectively…

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare