CBD: A Certified Backdoor Detector Based on Local Dominant Probability
Zhen Xiang, Zidi Xiong, Bo Li

TL;DR
This paper introduces CBD, a certified backdoor detection method for neural networks that guarantees detection under certain conditions and provides probabilistic false positive bounds, showing high effectiveness across multiple datasets.
Contribution
The paper proposes the first certified backdoor detector based on local dominant probability and conformal prediction, offering detection guarantees and false positive bounds.
Findings
CBD achieves high detection accuracy on multiple datasets.
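Theoretical results show that attacks whose triggers are more resilient to test-time noise and have smaller perturbation magnitudes are more likely to be detected with guarantees.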
Experimental results surpass state-of-the-art detectors in various scenarios.
Abstract
Backdoor attacks are a common threat to deep neural networks. During testing, samples embedded with a backdoor trigger are misclassified to an adversarial target class by a backdoored model, while samples without the trigger are classified correctly. In this paper, we present the first certified backdoor detector (CBD), which builds on a novel, adjustable conformal prediction scheme using our proposed statistic, local dominant probability. For any classifier under inspection, CBD provides 1) a detection inference, 2) the condition under which attacks are guaranteed to be detectable for the same classification domain, and 3) a probabilistic upper bound on the false positive rate. Our theoretical results show that attacks with triggers that are more resilient to test-time noise and have smaller perturbation magnitudes are more likely to be detected with guarantees.…
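The abstract ties the false positive bound to conformal prediction. As a rough illustration only, the sketch below shows a generic conformal p-value test, not the paper's adjustable scheme. It assumes each model is summarized by one scalar statistic where larger means more suspicious, and that statistics from known-benign models are available for calibration. How local dominant probability is computed is not given in this summary, so the statistic is treated as a black box. The names `conformal_p_value` and `detect` are hypothetical.

```python
import numpy as np


def conformal_p_value(calibration_stats, test_stat):
    """Conformal p-value of test_stat against benign calibration statistics.

    Assumption: larger statistics are more suspicious, and the calibration
    statistics come from benign models exchangeable with a benign test model.
    """
    cal = np.asarray(calibration_stats, dtype=float)
    # Count calibration values at least as extreme as the test statistic;
    # the +1 terms count the test point itself and keep the p-value valid.
    return (1.0 + np.sum(cal >= test_stat)) / (len(cal) + 1.0)


def detect(calibration_stats, test_stat, alpha=0.05):
    """Flag the inspected model as backdoored when the p-value is at most alpha."""
    return bool(conformal_p_value(calibration_stats, test_stat) <= alpha)


if __name__ == "__main__":
    # Made-up calibration statistics, for illustration only.
    benign_stats = list(range(1, 20))  # 19 benign models
    print(conformal_p_value(benign_stats, 100))  # 1 / 20 = 0.05
    print(detect(benign_stats, 100, alpha=0.05))
```

Under the exchangeability assumption, a benign model is flagged with probability at most `alpha`. That is the standard conformal guarantee, and it is the kind of probabilistic false positive bound the abstract refers to.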
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
