Improving Feature Attribution through Input-specific Network Pruning
Ashkan Khakzar, Soroosh Baselizadeh, Saurabh Khanduja, Christian Rupprecht, Seong Tae Kim, Nassir Navab

TL;DR
This paper introduces an input-specific neural network pruning method that enhances feature attribution by producing more accurate and fine-grained importance maps, outperforming existing gradient-based methods.
Contribution
The paper presents a novel input-specific pruning technique that improves neural network interpretability by reducing noise and increasing the precision of attribution maps.
Findings
Input-specific pruning shifts gradients from local to global importance.
The method produces more detailed attribution maps.
It outperforms existing attribution methods across multiple benchmarks.
Abstract
Attributing the output of a neural network to the contribution of given input elements is a way of shedding light on the black-box nature of neural networks. Due to the complexity of current network architectures, current gradient-based attribution methods provide very noisy or coarse results. We propose to prune a neural network for a given single input to keep only neurons that highly contribute to the prediction. We show that by input-specific pruning, network gradients change from reflecting local (noisy) importance information to global importance. Our proposed method is efficient and generates fine-grained attribution maps. We further provide a theoretical justification of the pruning approach relating it to perturbations and validate it through a novel experimental setup. Our method is evaluated by multiple benchmarks: sanity checks, pixel perturbation, and Remove-and-Retrain…
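The pruning idea in the abstract can be illustrated with a minimal sketch on a toy one-hidden-layer ReLU network. This is not the authors' implementation: the contribution score used here (hidden activation times outgoing weight), the `keep_count` parameter, and all function names are assumptions chosen for illustration. The sketch scores hidden neurons for one input, keeps only the top contributors, and compares the input gradient of the full network with that of the pruned one.

```python
def forward(x, W1, w2, mask):
    """Toy network: out = w2 . (mask * relu(W1 x)). W1 has one row per hidden neuron."""
    pre = [sum(wi * xi for wi, xi in zip(row, x)) for row in W1]
    hidden = [max(0.0, p) * m for p, m in zip(pre, mask)]
    out = sum(w * h for w, h in zip(w2, hidden))
    return pre, hidden, out


def prune_mask(x, W1, w2, keep_count):
    """Keep the keep_count hidden neurons with the highest contribution for input x.

    Assumed score: activation * outgoing weight (hypothetical, for illustration).
    """
    _, hidden, _ = forward(x, W1, w2, [1] * len(W1))
    scores = [w * h for w, h in zip(w2, hidden)]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = set(ranked[:keep_count])
    return [1 if j in kept else 0 for j in range(len(scores))]


def input_gradient(x, W1, w2, mask):
    """Gradient of the output with respect to x, through kept and active neurons only."""
    pre, _, _ = forward(x, W1, w2, mask)
    grad = [0.0] * len(x)
    for row, p, w, m in zip(W1, pre, w2, mask):
        if p > 0 and m:  # ReLU is active and the neuron was not pruned
            for i, wi in enumerate(row):
                grad[i] += w * wi
    return grad


# Toy weights and input (made up for this example).
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w2 = [2.0, -1.0, 0.5]
x = [1.0, 1.0]

mask = prune_mask(x, W1, w2, keep_count=2)
grad_full = input_gradient(x, W1, w2, [1, 1, 1])
grad_pruned = input_gradient(x, W1, w2, mask)
print("mask:", mask)
print("full gradient:", grad_full)
print("pruned gradient:", grad_pruned)
```

In this toy case the neuron with a negative contribution is removed, so the pruned gradient reflects only the neurons that support the prediction for this particular input. A different input would generally yield a different mask, which is what makes the pruning input-specific.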
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Explainable Artificial Intelligence (XAI)
Methods: Pruning
