From Weight Perturbation to Feature Attribution for Explaining Fully Connected Neural Networks

Thodoris Lymperopoulos; Denia Kanellopoulou

arXiv:2605.15328·cs.LG·May 18, 2026

TL;DR

This paper introduces a feature attribution approach for Fully Connected Neural Networks (FCNNs) that perturbs the weights attached to each feature rather than the feature values, aiming for more reliable explanations while remaining competitive with existing methods.

Contribution

It proposes estimating feature attribution through weight perturbation and derives two attribution methods from this rule, XWP and XWP_c, for interpreting FCNNs.

Findings

01

XWP and XWP_c achieve competitive results on baseline metrics.

02

Weight perturbation offers a new perspective on attribution, aimed at mitigating common limitations of Occlusion techniques such as Added Bias and Out-of-Distribution data.

03

The methods enhance the robustness and reliability of explanations.

Abstract

Fully Connected Neural Networks (FCNNs) are often regarded as simple and intuitive architectures, yet they serve as the foundation for more complex models. Nonetheless, the lack of consensus on their interpretability continues to pose challenges, underscoring the enduring relevance of simpler, attribution-based approaches for understanding even the most advanced neural architectures. In this regard, we explore a novel idea for estimating feature attribution, by applying perturbation to the features' attached weights instead of their values. This method offers a fresh perspective aimed at mitigating common limitations in Occlusion techniques, such as Added Bias and Out-of-Distribution data. The application of this rule leads to the formation of a pair of novel attribution methods we call XWP and XWP_c. Founded on simple rules, our methods achieve competitive performance in identifying…
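
The contrast drawn in the abstract, perturbing a feature's attached weights instead of its value, can be illustrated with a small sketch. The abstract shown here is truncated and does not state the actual XWP or XWP_c rules, so the code below is not the authors' method. It is a hypothetical illustration that assumes a two-layer ReLU network, multiplicative Gaussian noise on the first-layer weight column of each feature, and the mean absolute output change as the score. All names (`forward`, `occlusion_attribution`, `weight_perturbation_attribution`, `sigma`, `n_samples`) are assumptions introduced for the example.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Tiny FCNN: one ReLU hidden layer followed by a scalar linear output."""
    h = np.maximum(0.0, W1 @ x + b1)
    return float(W2 @ h + b2)

def occlusion_attribution(x, params, baseline=0.0):
    """Classic occlusion: replace feature j with a baseline value and
    record the absolute change in the output."""
    ref = forward(x, *params)
    scores = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        x_occ = x.copy()
        x_occ[j] = baseline  # the modified input may be out of distribution
        scores[j] = abs(ref - forward(x_occ, *params))
    return scores

def weight_perturbation_attribution(x, params, sigma=0.1, n_samples=200, seed=0):
    """Hypothetical weight-perturbation score (NOT the paper's XWP rule):
    add multiplicative noise to the first-layer weights attached to
    feature j, keep the input itself untouched, and average the
    absolute change in the output."""
    W1, b1, W2, b2 = params
    rng = np.random.default_rng(seed)
    ref = forward(x, *params)
    scores = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        diffs = []
        for _ in range(n_samples):
            W1_p = W1.copy()
            # Column j holds the weights attached to feature j.
            W1_p[:, j] *= 1.0 + sigma * rng.standard_normal(W1.shape[0])
            diffs.append(abs(ref - forward(x, W1_p, b1, W2, b2)))
        scores[j] = np.mean(diffs)
    return scores

# Toy network with 3 input features; feature 2 has no attached weights.
W1 = np.array([[1.0, -2.0, 0.0],
               [0.5,  1.0, 0.0]])
b1 = np.array([0.5, 0.5])
W2 = np.array([1.0, 3.0])
b2 = 0.0
params = (W1, b1, W2, b2)
x = np.array([1.0, 0.5, 3.0])

occ = occlusion_attribution(x, params)
wp = weight_perturbation_attribution(x, params)
print("occlusion:          ", occ)
print("weight perturbation:", wp)
```

In this sketch the input `x` is never modified by the weight-perturbation variant, so the network is always evaluated on the original data point. That is the property the abstract contrasts with Occlusion. The choice of noise model and aggregation is arbitrary here and would need to be replaced by the rules defined in the paper.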
