Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability

Joakim Edin; Andreas Geert Motzfeldt; Casper L. Christensen; Tuukka Ruotsalo; Lars Maaløe; Maria Maistro

arXiv:2408.08137 · cs.LG · May 26, 2025

TL;DR

This paper introduces Normalized AOPC, a new metric for feature attribution faithfulness in neural networks, addressing limitations of the traditional AOPC by enabling reliable cross-model comparisons and interpretation.

Contribution

The paper proposes a normalization method for AOPC that improves the reliability and interpretability of faithfulness evaluations across different models.

Findings

01 Normalization radically changes AOPC results
02 Previous conclusions based on AOPC may be unreliable
03 Normalized AOPC provides a more robust evaluation framework

Abstract

Deep neural network predictions are notoriously difficult to interpret. Feature attribution methods aim to explain these predictions by identifying the contribution of each input feature. Faithfulness, often evaluated using the area over the perturbation curve (AOPC), reflects feature attributions' accuracy in describing the internal mechanisms of deep neural networks. However, many studies rely on AOPC to compare faithfulness across different models, which we show can lead to false conclusions about models' faithfulness. Specifically, we find that AOPC is sensitive to variations in the model, resulting in unreliable cross-model comparisons. Moreover, AOPC scores are difficult to interpret in isolation without knowing the model-specific lower and upper limits. To address these issues, we propose a normalization approach, Normalized AOPC (NAOPC), enabling consistent cross-model…
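
The idea of rescaling AOPC by model-specific limits can be illustrated with a small sketch. It assumes a min-max form, (AOPC - lower) / (upper - lower), which is one plausible reading of "normalization" given the lower and upper limits mentioned above; the abstract as shown here is truncated, so treat that form as an assumption. The function names (`aopc`, `aopc_limits`, `normalized_aopc`), the zero baseline used to remove a feature, and the toy linear models are all hypothetical and are not taken from the joakimedin/naopc repository. The limits are found by brute force over every removal order, which is only feasible for a handful of features.

```python
from itertools import permutations

import numpy as np


def aopc(model, x, order, baseline=0.0):
    """Mean drop in model output as features are removed one by one in `order`."""
    x = np.asarray(x, dtype=float)
    original = model(x)
    perturbed = x.copy()
    drops = []
    for idx in order:
        perturbed[idx] = baseline  # assumption: "removing" a feature means zeroing it
        drops.append(original - model(perturbed))
    return float(np.mean(drops))


def aopc_limits(model, x, baseline=0.0):
    """Lowest and highest AOPC over all removal orders (exhaustive, tiny inputs only)."""
    scores = [aopc(model, x, order, baseline) for order in permutations(range(len(x)))]
    return min(scores), max(scores)


def normalized_aopc(model, x, attributions, baseline=0.0):
    """Min-max rescaling of AOPC into [0, 1] using the model-specific limits."""
    order = np.argsort(-np.asarray(attributions, dtype=float))  # most important first
    lower, upper = aopc_limits(model, x, baseline)
    if upper == lower:
        return float("nan")  # every order scores the same, so the ratio is undefined
    return (aopc(model, x, order, baseline) - lower) / (upper - lower)


# Two toy linear "models" that differ only in output scale (hypothetical).
weights = np.array([3.0, 1.0, 2.0])


def small_model(x):
    return float(np.dot(weights, x))


def large_model(x):
    return 10.0 * float(np.dot(weights, x))


x = np.ones(3)
attributions = weights * x  # a perfect ranking for these linear models

print(aopc(small_model, x, np.argsort(-attributions)))
print(aopc(large_model, x, np.argsort(-attributions)))
print(normalized_aopc(small_model, x, attributions))
print(normalized_aopc(large_model, x, attributions))
```

In this toy setting the raw AOPC of the second model is ten times larger even though the feature ranking is identical, while the normalized score is the same for both, which is the kind of cross-model distortion the abstract describes.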

Code & Models

Repositories

joakimedin/naopc (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Machine Learning in Healthcare