Test-Time Backdoor Defense via Detecting and Repairing

Jiyang Guan; Jian Liang; Ran He

arXiv:2308.06107·cs.CR·November 30, 2023

Open Access

TL;DR

This paper introduces TTBD, a test-time defense that detects poisoned samples in partially poisoned test data and uses them to repair the backdoored neural network, without requiring clean data beforehand.

Contribution

The paper proposes a novel two-stage test-time defense approach combining poisoned sample detection and neuron pruning to remove backdoors.

Findings

01  TTBD removes backdoors across various models and datasets.

02  The DDP method effectively detects poisoned samples.

03  Neuron pruning effectively removes the backdoor from the model.

Abstract

Deep neural networks play a crucial part in many critical domains, such as autonomous driving, face recognition, and medical diagnosis. However, they face security threats from backdoor attacks, through which an attacker can manipulate a model into attacker-chosen behaviors. To defend against backdoors, prior research has focused on using clean data to remove them before model deployment. In this paper, we investigate the possibility of defending against backdoor attacks at test time by utilizing partially poisoned data to remove the backdoor from the model. To address this problem, we propose a two-stage method, Test-Time Backdoor Defense (TTBD). In the first stage, we propose a backdoor sample detection method, DDP, to identify poisoned samples from a batch of mixed, partially poisoned samples. Once the poisoned samples are detected, we employ Shapley…
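The abstract is cut off before it explains how Shapley values enter the second stage, so the following is only a generic illustration of the idea of attributing a backdoor's effect to individual neurons and pruning the most responsible one. It is not the paper's procedure. Everything in it is an assumption made for illustration: the toy activations, the weights, the value function `target_logit`, and the choice to prune a single neuron. Shapley values are computed exactly by averaging marginal contributions over all neuron orderings, which is only feasible for a handful of neurons.

```python
from itertools import permutations


def shapley_values(n_players, value_fn):
    """Exact Shapley values: average each player's marginal
    contribution over every possible ordering of the players."""
    totals = [0.0] * n_players
    orderings = list(permutations(range(n_players)))
    for order in orderings:
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for player in order:
            coalition.add(player)
            cur = value_fn(frozenset(coalition))
            totals[player] += cur - prev
            prev = cur
    return [t / len(orderings) for t in totals]


# Hypothetical toy data: activations of 4 hidden neurons on 3 samples
# that a detector has flagged as poisoned (values are made up).
ACTIVATIONS = [
    [0.1, 0.2, 3.0, 0.0],
    [0.0, 0.3, 2.5, 0.1],
    [0.2, 0.1, 3.5, 0.0],
]
# Hypothetical weights from each neuron to the attacker's target logit.
TARGET_WEIGHTS = [0.5, -0.2, 1.0, 0.3]


def target_logit(kept):
    """Mean target-class logit over the flagged samples when only the
    neurons in `kept` are active (all others are masked to zero)."""
    total = 0.0
    for acts in ACTIVATIONS:
        total += sum(TARGET_WEIGHTS[i] * acts[i] for i in kept)
    return total / len(ACTIVATIONS)


all_neurons = frozenset(range(4))
values = shapley_values(4, target_logit)

# Assumption: prune the single neuron with the largest Shapley value.
prune_index = max(range(4), key=lambda i: values[i])
kept_after_pruning = all_neurons - {prune_index}

print("Shapley values:", values)
print("Pruned neuron:", prune_index)
print("Target logit before/after:",
      target_logit(all_neurons), target_logit(kept_after_pruning))
```

In a real network, exact enumeration is intractable and the value function would come from model evaluations, so practical methods approximate Shapley values by sampling. The toy value function here is additive, which makes each neuron's Shapley value equal to its own contribution; that keeps the example easy to check by hand.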

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications