Meta-evaluating stability measures: MAX-Sensitivity & AVG-Sensitivity

Miquel Miró-Nicolau; Antoni Jaume-i-Capó; Gabriel Moyà-Alcover

arXiv:2412.10942·cs.CV·December 17, 2024

TL;DR

This paper critically assesses the reliability of existing stability measures in XAI, revealing their limitations through novel meta-evaluation tests and highlighting the need for more robust evaluation metrics.

Contribution

It introduces two new tests for meta-evaluating stability measures and uses them to show that the measures cannot reliably distinguish random explanations from meaningful ones.

Findings

01

Existing stability metrics fail to identify random explanations.

02

The proposed tests reveal the unreliability of AVG-Sensitivity and MAX-Sensitivity.

03

Stability measures need improved evaluation methods.

Abstract

The use of eXplainable Artificial Intelligence (XAI) systems has introduced a set of challenges that need resolution. The robustness, or stability, of XAI has been one of the community's goals from the beginning. Multiple authors have proposed evaluating this feature using objective evaluation measures. Nonetheless, many questions remain. With this work, we propose a novel approach to meta-evaluate these metrics, i.e., to analyze the correctness of the evaluators. We propose two new tests that allowed us to evaluate two different stability measures: AVG-Sensitivity and MAX-Sensitivity. We tested their reliability in the presence of perfect and robust explanations, generated with a Decision Tree, as well as completely random explanations and predictions. The metrics' results showed that they are unable to identify the random explanations as erroneous, highlighting their overall unreliability.
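
To make concrete what the two measures compute, here is a minimal sketch of a Monte Carlo estimate of MAX-Sensitivity and AVG-Sensitivity as they are commonly defined in the XAI literature: the largest, and the mean, distance between the explanation of an input and the explanations of slightly perturbed copies of it. This is not the authors' code and not their meta-evaluation protocol. The function names, the uniform perturbation, the L2 distance, and the two toy explainers are all assumptions chosen for illustration.

```python
import numpy as np


def sensitivity(explain, x, radius=0.1, n_samples=50, seed=0):
    """Estimate (MAX-Sensitivity, AVG-Sensitivity) of `explain` around x.

    Assumption: perturbations are drawn uniformly from an L-infinity ball
    of the given radius, and explanations are compared with the L2 norm.
    """
    rng = np.random.default_rng(seed)
    base = explain(x)
    distances = []
    for _ in range(n_samples):
        delta = rng.uniform(-radius, radius, size=x.shape)
        distances.append(np.linalg.norm(explain(x + delta) - base))
    distances = np.asarray(distances)
    return float(distances.max()), float(distances.mean())


# Hypothetical explainer 1: the same attribution for every input,
# so the explanation never changes under perturbation.
def constant_explainer(x):
    return np.ones_like(x)


# Hypothetical explainer 2: weight-times-input attribution for a linear model.
w = np.array([1.0, -2.0, 0.5, 3.0])


def linear_explainer(x):
    return w * x


x = np.zeros(4)
radius = 0.1
max_const, avg_const = sensitivity(constant_explainer, x, radius=radius)
max_lin, avg_lin = sensitivity(linear_explainer, x, radius=radius)

print("constant explainer:", max_const, avg_const)
print("linear explainer:  ", max_lin, avg_lin)
```

Lower values indicate a more stable explanation under both measures. The constant explainer scores exactly zero, and the linear explainer's scores are bounded by the radius times the norm of `w`. A meta-evaluation in the spirit of the abstract would ask whether such scores separate explanations known to be correct from explanations known to be meaningless.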

Code & Models

Repositories

explainingai/stability (Official)

Taxonomy

Methods: Sparse Evolutionary Training