TL;DR
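This paper introduces a multi-criteria evaluation framework for explainable AI (XAI) methods, covering fidelity, interpretability, robustness, fairness, and completeness. It also proposes Perturbation-Gradient Consensus Attribution (PGCA), an attribution technique that combines perturbation importance with gradient-based methods and that the authors report outperforms baselines across multiple domains.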
Contribution
The paper presents a unified multi-criteria evaluation framework and a new attribution method, PGCA, enhancing fidelity, interpretability, and fairness in explainable AI.
Findings
PGCA outperforms baselines in fidelity, interpretability, and fairness metrics.
The evaluation framework effectively ranks XAI methods across diverse domains.
Code and results are publicly available for reproducibility.
Abstract
Explainable Artificial Intelligence (XAI) methods are increasingly used in safety-critical domains, yet there is no unified framework to jointly evaluate fidelity, interpretability, robustness, fairness, and completeness. We address this gap through two contributions. First, we propose a multi-criteria evaluation framework that formalizes these five criteria using principled metrics: fidelity via prediction-gap analysis; interpretability via a composite concentration-coherence-contrast score; robustness via cosine-similarity perturbation stability; fairness via Jensen-Shannon divergence across demographic groups; and completeness via feature-ablation coverage. These are integrated using an entropy-weighted dynamic scoring scheme that adapts to domain-specific priorities. Second, we introduce Perturbation-Gradient Consensus Attribution (PGCA), which fuses grid-based perturbation…
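To make three of the ingredients named in the abstract more concrete, here is a minimal Python sketch of cosine-similarity perturbation stability, Jensen-Shannon divergence between two groups' attribution distributions, and entropy-based criterion weighting. It is an illustration under stated assumptions, not the paper's implementation: the function names, the Gaussian noise model, the base-2 logarithm, and the use of the classical entropy weight method are all assumptions, since the summary above does not give the exact formulas.

```python
import numpy as np


def cosine_stability(attr_fn, x, noise_scale=0.01, n_trials=20, seed=0):
    """Mean cosine similarity between the attribution of x and the
    attributions of noisy copies of x (Gaussian noise is an assumption)."""
    rng = np.random.default_rng(seed)
    base = np.asarray(attr_fn(x), dtype=float)
    sims = []
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, noise_scale, size=x.shape)
        attr = np.asarray(attr_fn(noisy), dtype=float)
        denom = np.linalg.norm(base) * np.linalg.norm(attr)
        sims.append(float(base @ attr / denom) if denom > 0 else 0.0)
    return float(np.mean(sims))


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the range is [0, 1]) between
    two non-negative vectors, each normalized to sum to 1."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # 0 * log(0) is treated as 0
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def entropy_weights(scores):
    """Classical entropy weight method (an assumed stand-in for the paper's
    scheme). scores has shape (n_methods, n_criteria) with positive column
    sums; criteria whose scores vary more across methods get more weight."""
    scores = np.asarray(scores, dtype=float)
    n_methods = scores.shape[0]
    p = scores / scores.sum(axis=0, keepdims=True)
    log_p = np.log(np.where(p > 0, p, 1.0))  # log(1) = 0 where p == 0
    entropy = -(p * log_p).sum(axis=0) / np.log(n_methods)
    diversity = 1.0 - entropy
    total = diversity.sum()
    if total <= 0:  # every criterion is uninformative: fall back to uniform
        return np.full(scores.shape[1], 1.0 / scores.shape[1])
    return diversity / total


if __name__ == "__main__":
    # Hypothetical attribution function: gradient-times-input of a linear model.
    w = np.array([1.0, 2.0, 3.0])
    print(cosine_stability(lambda v: v * w, np.ones(3)))
    print(js_divergence([0.7, 0.2, 0.1], [0.1, 0.2, 0.7]))
    print(entropy_weights([[0.9, 0.5], [0.1, 0.5]]))
```

In this sketch a criterion on which all methods score identically receives zero weight, which is one way a scoring scheme can adapt to the data at hand; whether the paper's dynamic scheme behaves the same way cannot be determined from the truncated abstract.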
