Test-Time Backdoor Defense via Detecting and Repairing
Jiyang Guan, Jian Liang, Ran He

TL;DR
This paper introduces TTBD, a test-time defense that detects poisoned samples and repairs the backdoored neural network using partially poisoned data, without requiring clean data beforehand.
Contribution
The paper proposes a novel two-stage test-time defense approach combining poisoned sample detection and neuron pruning to remove backdoors.
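The detect-then-prune idea can be illustrated with a minimal sketch. It is not the paper's method: the detection stage is assumed to have already produced a per-sample `flagged` list, and neurons are ranked by a simple mean-activation gap rather than by the scoring the paper uses. All function names here are hypothetical.

```python
from statistics import mean


def rank_neurons_by_gap(activations, flagged):
    """Rank neurons by how much more they fire on flagged samples.

    activations: list of per-sample lists of neuron activations.
    flagged: list of bools from an (assumed) upstream detector.
    This activation-gap score is a simplified stand-in, not the
    scoring used in the paper.
    """
    poisoned = [a for a, f in zip(activations, flagged) if f]
    clean = [a for a, f in zip(activations, flagged) if not f]
    if not poisoned or not clean:
        return []  # nothing to compare against
    n = len(activations[0])
    gaps = [
        mean(s[j] for s in poisoned) - mean(s[j] for s in clean)
        for j in range(n)
    ]
    # Largest gap first: these neurons are the pruning candidates.
    return sorted(range(n), key=lambda j: gaps[j], reverse=True)


def prune_mask(num_neurons, ranked, k):
    """Build a 0/1 mask that zeroes the top-k ranked neurons."""
    pruned = set(ranked[:k])
    return [0.0 if j in pruned else 1.0 for j in range(num_neurons)]


def apply_mask(activation, mask):
    """Apply the pruning mask to one sample's activations."""
    return [a * m for a, m in zip(activation, mask)]
```

In a real network the mask would be applied to a layer's output channels. The sketch only shows the control flow: flag samples, score neurons against the flagged set, then disable the highest-scoring ones.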
Findings
TTBD removes backdoors across a variety of models and datasets.
The DDP method effectively detects poisoned samples.
Neuron pruning effectively eliminates the model's response to backdoor triggers.
Abstract
Deep neural networks have played a crucial part in many critical domains, such as autonomous driving, face recognition, and medical diagnosis. However, deep neural networks face security threats from backdoor attacks and can be manipulated into attacker-chosen behaviors by the backdoor attacker. To defend against backdoors, prior research has focused on using clean data to remove backdoors before model deployment. In this paper, we investigate the possibility of defending against backdoor attacks at test time by utilizing partially poisoned data to remove the backdoor from the model. To address the problem, a two-stage method, Test-Time Backdoor Defense (TTBD), is proposed. In the first stage, we propose a backdoor sample detection method, DDP, to identify poisoned samples from a batch of mixed, partially poisoned samples. Once the poisoned samples are detected, we employ Shapley…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
