PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection
Wei Li, Pin-Yu Chen, Sijia Liu, Ren Wang

TL;DR
The paper introduces PSBD, a backdoor detection method that leverages prediction shift uncertainty, which the authors hypothesize arises from a neuron bias effect. It requires only minimal unlabeled clean validation data and reports state-of-the-art results.
Contribution
Proposes PSBD, a new backdoor detection technique based on the prediction shift phenomenon and an uncertainty measurement, with minimal data requirements.
Findings
PSBD identifies backdoor training samples with high accuracy.
It outperforms existing detection methods in the paper's experiments.
It requires only minimal unlabeled clean validation data.
Abstract
Deep neural networks are susceptible to backdoor attacks, where adversaries manipulate model predictions by inserting malicious samples into the training data. Currently, there is still a significant challenge in identifying suspicious training data to unveil potential backdoor samples. In this paper, we propose a novel method, Prediction Shift Backdoor Detection (PSBD), leveraging an uncertainty-based approach requiring minimal unlabeled clean validation data. PSBD is motivated by an intriguing Prediction Shift (PS) phenomenon, where poisoned models' predictions on clean data often shift away from true labels towards certain other labels with dropout applied during inference, while backdoor samples exhibit less PS. We hypothesize PS results from the neuron bias effect, making neurons favor features of certain classes. PSBD identifies backdoor training samples by computing the…
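The Prediction Shift idea described in the abstract can be illustrated with a small sketch. This is not the authors' PSBD scoring (the abstract is cut off before that step is described); it only shows, for a toy linear classifier, how one might count how often a prediction changes once dropout is applied at inference. The function names, the linear model, and the use of inverted dropout on input features are all assumptions made for illustration.

```python
import random


def dropout(features, p, rng):
    """Inverted dropout: zero each feature with probability p (0 <= p < 1),
    and rescale the kept features by 1 / (1 - p)."""
    keep = 1.0 - p
    return [f / keep if rng.random() < keep else 0.0 for f in features]


def predict(weights, features):
    """Argmax class of a toy linear layer (one weight row per class).
    Ties resolve to the lowest class index."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def prediction_shift_rate(weights, features, p, n_passes=100, seed=0):
    """Fraction of dropout passes whose predicted class differs from the
    prediction made without dropout (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    base = predict(weights, features)
    shifted = sum(
        predict(weights, dropout(features, p, rng)) != base
        for _ in range(n_passes)
    )
    return shifted / n_passes


# A sample whose class rests on a single feature shifts under dropout,
# while a sample whose class is supported by every feature does not.
fragile = prediction_shift_rate([[1.0, 0.0], [0.0, 1.0]], [5.0, 0.1], p=0.5)
stable = prediction_shift_rate([[1.0, 0.5], [0.0, 0.1]], [5.0, 5.0], p=0.5)
print(fragile, stable)
```

In this toy setting, a low shift rate simply means the prediction survives random feature removal. The abstract's observation is that, in poisoned models, backdoor samples exhibit less shift than clean data; how PSBD turns that into a detection score is not covered by this sketch.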
Taxonomy
Topics: Medical Imaging Techniques and Applications
Methods: Dropout
