Rethinking Robustness: A New Approach to Evaluating Feature Attribution Methods

Panagiota Kiourti, Anu Singh, Preeti Duraipandian, Weichao Zhou, and Wenchao Li

arXiv:2512.06665 · cs.LG · December 9, 2025

TL;DR

This paper proposes a new framework for evaluating the robustness of feature attribution methods in neural networks, emphasizing the need for objective metrics that accurately reflect attribution method weaknesses.

Contribution

It introduces a novel robustness metric, a new definition of similar inputs, and a GAN-based method for generating these inputs, improving evaluation accuracy.

Findings

01. Existing metrics are insufficient for true attribution robustness assessment

02. A new robustness metric better captures attribution method weaknesses

03. GAN-based input generation enhances evaluation of attribution methods

Abstract

This paper studies the robustness of feature attribution methods for deep neural networks. It challenges the current notion of attributional robustness that largely ignores the difference in the model's outputs and introduces a new way of evaluating the robustness of attribution methods. Specifically, we propose a new definition of similar inputs, a new robustness metric, and a novel method based on generative adversarial networks to generate these inputs. In addition, we present a comprehensive evaluation with existing metrics and state-of-the-art attribution methods. Our findings highlight the need for a more objective metric that reveals the weaknesses of an attribution method rather than those of the neural network, thus providing a more accurate evaluation of the robustness of attribution methods.
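To make the abstract's central concern concrete, the following is a minimal sketch that is not taken from the paper. It assumes a toy linear model and gradient-times-input attributions, and every name in it (`predict`, `gradient_times_input`, `attribution_and_output_change`) is hypothetical. It reports the change in attributions between two inputs alongside the change in the model's output, which is the quantity the abstract says current robustness notions largely ignore. It does not implement the paper's metric, its definition of similar inputs, or its GAN-based generator.

```python
import math


def predict(weights, x):
    """Toy linear model (an assumption for illustration): dot product of weights and input."""
    return sum(w * xi for w, xi in zip(weights, x))


def gradient_times_input(weights, x):
    """Gradient-times-input attribution; for a linear model the gradient equals the weights."""
    return [w * xi for w, xi in zip(weights, x)]


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def attribution_and_output_change(weights, x, x_other):
    """Return (attribution change, output change) between two inputs."""
    attr_change = l2_distance(
        gradient_times_input(weights, x),
        gradient_times_input(weights, x_other),
    )
    output_change = abs(predict(weights, x) - predict(weights, x_other))
    return attr_change, output_change


# Hypothetical weights and inputs chosen only for illustration.
weights = [2.0, -1.0, 0.5]
x = [1.0, 1.0, 1.0]
x_other = [1.0, 1.0, 3.0]

attr_change, output_change = attribution_and_output_change(weights, x, x_other)
print(f"attribution change: {attr_change}, output change: {output_change}")
```

In this toy case the attributions shift, but so does the model's output, so the shift tracks the model's behavior rather than exposing a fragile attribution method. A measure that looks only at the attribution change cannot tell these two situations apart.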

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI