PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Wei Li; Pin-Yu Chen; Sijia Liu; Ren Wang

arXiv:2406.05826·cs.LG·April 17, 2025

TL;DR

The paper introduces PSBD, a backdoor detection method that leverages uncertainty in the prediction shift phenomenon, which the authors hypothesize results from a neuron bias effect. It requires only minimal unlabeled clean validation data and reports state-of-the-art results.

Contribution

Proposes PSBD, a new backdoor detection technique based on the prediction shift phenomenon and an uncertainty measurement, with minimal data requirements.

Findings

01. PSBD detects backdoor samples in the training data with high accuracy.

02. It outperforms existing detection methods in experiments.

03. It requires only minimal unlabeled clean validation data.

Abstract

Deep neural networks are susceptible to backdoor attacks, where adversaries manipulate model predictions by inserting malicious samples into the training data. Currently, there is still a significant challenge in identifying suspicious training data to unveil potential backdoor samples. In this paper, we propose a novel method, Prediction Shift Backdoor Detection (PSBD), leveraging an uncertainty-based approach requiring minimal unlabeled clean validation data. PSBD is motivated by an intriguing Prediction Shift (PS) phenomenon, where poisoned models' predictions on clean data often shift away from true labels towards certain other labels with dropout applied during inference, while backdoor samples exhibit less PS. We hypothesize PS results from the neuron bias effect, making neurons favor features of certain classes. PSBD identifies backdoor training samples by computing the…
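The prediction shift described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear classifier, applies dropout to the input features rather than to internal layers, and defines shift simply as the fraction of dropout passes whose predicted label differs from the dropout-free prediction. All function names, weights, and inputs below are hypothetical.

```python
import random


def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def logits(weights, features):
    """Toy linear classifier: one weight row per class."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def dropout(features, rate, rng):
    """Inverted dropout: zero each feature with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in features]


def shift_rate(weights, features, rate=0.5, passes=200, seed=0):
    """Fraction of dropout passes whose predicted label differs from the
    dropout-free prediction (an assumed, simplified notion of prediction shift)."""
    rng = random.Random(seed)
    base = argmax(logits(weights, features))
    shifted = sum(
        argmax(logits(weights, dropout(features, rate, rng))) != base
        for _ in range(passes)
    )
    return shifted / passes


# Hypothetical inputs: one prediction backed by redundant evidence,
# one that wins by a narrow margin and depends on a single feature.
robust = shift_rate([[1, 1, 1, 1], [0, 0, 0, 0]], [1, 1, 1, 1])
fragile = shift_rate([[1, 0], [0, 0.9]], [1, 1])
```

In this toy setting the redundantly supported prediction never changes under dropout, while the narrow-margin one sometimes flips. The abstract reports an analogous contrast in poisoned models, where backdoor samples exhibit less prediction shift than clean data; the sketch does not model poisoning itself.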

Code & Models

Repositories

wl-619/psbd
PyTorch, official

Taxonomy

Topics: Medical Imaging Techniques and Applications

Methods: Dropout