Rethinking Pruning for Backdoor Mitigation: An Optimization Perspective
Nan Li, Haiyang Yu, Ping Yi

TL;DR
This paper introduces a novel approach using Graph Neural Networks and Reinforcement Learning to optimize neuron pruning for backdoor mitigation in deep neural networks, achieving state-of-the-art results with minimal performance loss.
Contribution
To the authors' knowledge, it is the first work to use a GNN and RL to optimize pruning policies specifically for backdoor defense in DNNs.
Findings
Effective backdoor neuron removal with minimal accuracy loss.
Achieves state-of-the-art backdoor mitigation performance.
Requires only a small amount of clean data.
Abstract
Deep Neural Networks (DNNs) are known to be vulnerable to backdoor attacks, which threaten their reliable deployment. Recent research reveals that backdoors can be erased from infected DNNs by pruning a specific group of neurons, but how to effectively identify and remove these backdoor-associated neurons remains an open challenge. Most existing defense methods rely on defined rules and focus on neurons' local properties, ignoring the exploration and optimization of pruning policies. To address this gap, we propose an Optimized Neuron Pruning (ONP) method that combines a Graph Neural Network (GNN) with Reinforcement Learning (RL) to repair backdoored models. Specifically, ONP first models the target DNN as graphs based on neuron connectivity, and then uses GNN-based RL agents to learn graph embeddings and find a suitable pruning policy. To the best of our knowledge,…
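The abstract's first step, modeling the DNN as a graph based on neuron connectivity, and the final act of pruning can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `layer_to_graph` and `prune_neurons`, the use of absolute weight as an edge attribute, and the plain nested-list weight format are all assumptions made for illustration. The GNN embedding and RL policy that choose which neurons to prune are omitted entirely.

```python
from typing import List, Set, Tuple

# A node is (layer_index, neuron_index); an edge carries |weight| as its attribute.
# These representations are hypothetical, not taken from the paper.
Node = Tuple[int, int]
Edge = Tuple[Node, Node, float]


def layer_to_graph(weights: List[List[float]], layer: int) -> List[Edge]:
    """Turn one dense layer into weighted edges.

    weights[j][i] is the weight from input neuron i (in `layer`)
    to output neuron j (in `layer + 1`). Zero weights yield no edge.
    """
    edges: List[Edge] = []
    for j, row in enumerate(weights):
        for i, w in enumerate(row):
            if w != 0.0:
                edges.append(((layer, i), (layer + 1, j), abs(w)))
    return edges


def prune_neurons(weights: List[List[float]], pruned: Set[int]) -> List[List[float]]:
    """Return a copy of `weights` with the rows of pruned output neurons zeroed.

    Zeroing a row removes every incoming connection of that output neuron,
    which is one simple way to model neuron pruning.
    """
    return [
        [0.0] * len(row) if j in pruned else list(row)
        for j, row in enumerate(weights)
    ]


# Example: a 2-input, 2-output layer; a (hypothetical) policy prunes output neuron 1.
layer_weights = [[0.5, -1.0], [0.0, 2.0]]
graph_before = layer_to_graph(layer_weights, layer=0)
graph_after = layer_to_graph(prune_neurons(layer_weights, {1}), layer=0)
```

In this sketch a pruning policy is just a set of neuron indices; in ONP that choice is made by the learned agent rather than supplied by hand.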
Taxonomy
Topics: Software Testing and Debugging Techniques · Safety Systems Engineering in Autonomy · Formal Methods in Verification
Methods: Sparse Evolutionary Training · Focus · Pruning · Graph Neural Network
