Shapley-Guided Neural Repair Approach via Derivative-Free Optimization
Xinyu Sun, Wanwei Liu, Haoang Chi, Tingyu Chen, Xiaoguang Mao, Shangwen Wang, Lei Bu, Jingyi Wang, Yang Tan, and Zhenyi Qi

TL;DR
SHARPEN is a novel neural repair method combining interpretable fault localization with derivative-free optimization, effectively repairing DNNs against backdoors, adversarial attacks, and unfairness without relying on gradients.
Contribution
It introduces a hierarchical localization strategy using Deep SHAP and employs CMA-ES for gradient-free neuron repair, enhancing generalizability and interpretability.
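As a rough illustration of what gradient-free neuron repair can look like, the sketch below patches one outgoing weight of a toy network whose step activation has no useful gradient. It is not the authors' implementation: a simplified evolution strategy (isotropic Gaussian sampling with elite averaging) stands in for CMA-ES, which additionally adapts a full covariance matrix, and the network, data, and every name (`evolve`, `repair_loss`, `FAULTY`) are hypothetical.

```python
import random

def step(x):
    # Non-differentiable activation: the gradient is zero almost everywhere.
    return 1.0 if x > 0 else 0.0

def forward(x, w_hidden, w_out):
    hidden = [step(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sum(h * w for h, w in zip(hidden, w_out))

# Hypothetical toy network: 2 inputs, 3 hidden neurons, 1 output.
W_HIDDEN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_OUT_CLEAN = [1.0, -1.0, 0.5]
W_OUT_FAULTY = [1.0, 3.0, 0.5]   # neuron 1 carries a corrupted weight
FAULTY = [1]                     # assumed output of a prior localization step

INPUTS = [(1, 1), (1, -1), (-1, 1), (-1, -1), (0.5, 2), (2, -0.5)]
DATA = [(x, forward(x, W_HIDDEN, W_OUT_CLEAN)) for x in INPUTS]

def repair_loss(candidate):
    # Patch only the weights of the neurons flagged as faulty, then score.
    patched = list(W_OUT_FAULTY)
    for idx, value in zip(FAULTY, candidate):
        patched[idx] = value
    errors = [(forward(x, W_HIDDEN, patched) - y) ** 2 for x, y in DATA]
    return sum(errors) / len(errors)

def evolve(objective, x0, sigma=0.5, popsize=12, elite=3, generations=40, seed=0):
    # Simplified evolution strategy: sample around the mean, keep the best
    # candidates, move the mean to their average, and shrink the step size.
    rng = random.Random(seed)
    mean = list(x0)
    best, best_f = list(x0), objective(x0)
    for _ in range(generations):
        pop = [[m + sigma * rng.gauss(0, 1) for m in mean] for _ in range(popsize)]
        ranked = sorted(pop, key=objective)
        top = ranked[:elite]
        mean = [sum(c[j] for c in top) / elite for j in range(len(mean))]
        f = objective(ranked[0])
        if f < best_f:
            best, best_f = ranked[0], f
        sigma *= 0.95
    return best, best_f

start = [W_OUT_FAULTY[i] for i in FAULTY]
initial_loss = repair_loss(start)
repaired, repaired_loss = evolve(repair_loss, start)
```

Only loss evaluations are needed, so the same loop would work unchanged with any activation function. A production setup would more likely call an existing CMA-ES library than hand-roll the search.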
Findings
Outperforms baselines in backdoor removal (+10.56%)
Achieves significant improvements in adversarial mitigation (+5.78%)
Effectively repairs unfairness (+11.82%)
Abstract
DNNs are susceptible to defects like backdoors, adversarial attacks, and unfairness, undermining their reliability. Existing approaches mainly involve retraining, optimization, constraint-solving, or search algorithms. However, most methods rely on gradient calculations, restricting applicability to specific activation functions (e.g., ReLU), or use search algorithms with uninterpretable localization and repair. Furthermore, they often lack generalizability across multiple properties. We propose SHARPEN, integrating interpretable fault localization with a derivative-free optimization strategy. First, SHARPEN introduces a Deep SHAP-based localization strategy quantifying each layer's and neuron's marginal contribution to erroneous outputs. Specifically, a hierarchical coarse-to-fine approach reranks layers by aggregated impact, then locates faulty neurons/filters by analyzing activation…
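The coarse-to-fine localization idea in the abstract can be illustrated with a small sketch, under heavy assumptions. Deep SHAP approximates Shapley values for real networks; here the values are computed exactly by enumerating orderings, which is only feasible for a handful of units. The layer names, the toy `error_score` function, and the ranking rule are hypothetical, and because the abstract is cut off before it states the neuron-level criterion, the sketch simply ranks neurons by absolute Shapley value.

```python
from itertools import permutations

# Hypothetical units: (layer name, neuron index).
UNITS = [("conv1", 0), ("conv1", 1), ("fc1", 0), ("fc1", 1)]

def error_score(active):
    # Toy stand-in for "how strongly the erroneous output fires when only
    # the units in `active` are enabled". Includes one interaction term.
    score = 0.0
    if ("conv1", 0) in active:
        score += 0.1
    if ("conv1", 1) in active:
        score += 0.2
    if ("fc1", 0) in active:
        score += 1.0
    if ("fc1", 0) in active and ("fc1", 1) in active:
        score += 0.5
    return score

def shapley_values(players, value):
    # Exact Shapley value: average marginal contribution over all orderings.
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            phi[p] += value(frozenset(coalition)) - before
    return {p: total / len(orders) for p, total in phi.items()}

phi = shapley_values(UNITS, error_score)

# Coarse step: aggregate absolute contributions per layer and rank layers.
layer_scores = {}
for (layer, _), v in phi.items():
    layer_scores[layer] = layer_scores.get(layer, 0.0) + abs(v)
layer_ranking = sorted(layer_scores, key=layer_scores.get, reverse=True)

# Fine step: rank neurons inside the most suspicious layer.
top_layer = layer_ranking[0]
neuron_ranking = sorted(
    (u for u in UNITS if u[0] == top_layer),
    key=lambda u: abs(phi[u]),
    reverse=True,
)
```

In this toy setup the interaction term of 0.5 is split evenly between the two `fc1` units, so `("fc1", 0)` receives 1.25 and `("fc1", 1)` receives 0.25, and the values sum to the score of the full network (1.8), as the Shapley efficiency property requires.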
