Backdoor Defense via Decoupling the Training Process
Kunzhe Huang, Yiming Li, Baoyuan Wu, Zhan Qin, Kui Ren

TL;DR
This paper introduces a novel backdoor defense method that decouples training into self-supervised backbone learning, supervised fine-tuning, and semi-supervised adjustment, effectively reducing backdoor risks while maintaining high accuracy.
Contribution
It proposes a new three-stage training decoupling approach leveraging self-supervised learning to improve backdoor defense in DNNs.
Findings
Effective in reducing backdoor threats across multiple datasets.
Preserves high accuracy on benign samples.
Outperforms existing defense methods.
Abstract
Recent studies have revealed that deep neural networks (DNNs) are vulnerable to backdoor attacks, where attackers embed hidden backdoors in the DNN model by poisoning a few training samples. The attacked model behaves normally on benign samples, whereas its prediction will be maliciously changed when the backdoor is activated. We reveal that poisoned samples tend to cluster together in the feature space of the attacked DNN model, which is mostly due to the end-to-end supervised training paradigm. Inspired by this observation, we propose a novel backdoor defense via decoupling the original end-to-end training process into three stages. Specifically, we first learn the backbone of a DNN model via self-supervised learning based on training samples without their labels. The learned backbone will map samples with the same ground-truth label to similar locations in the feature space.…
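The three-stage flow summarized above can be sketched at the data-handling level. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `strip_labels` and `split_by_loss`, the `keep_fraction` parameter, and the choice to decide which labels to trust by per-sample loss are all hypothetical, and no model training is performed.

```python
from typing import Hashable, List, Optional, Sequence, Tuple

Sample = Tuple[Hashable, int]                  # (input, possibly poisoned label)
MaybeLabeled = Tuple[Hashable, Optional[int]]  # label is None when distrusted


def strip_labels(dataset: Sequence[Sample]) -> List[Hashable]:
    """Stage 1 input: the self-supervised backbone sees samples only, never labels."""
    return [x for x, _ in dataset]


def split_by_loss(
    dataset: Sequence[Sample],
    losses: Sequence[float],
    keep_fraction: float = 0.5,
) -> List[MaybeLabeled]:
    """Stage 3 input (assumed rule): keep labels of the lowest-loss samples,
    treat every other sample as unlabeled for semi-supervised training."""
    if len(dataset) != len(losses):
        raise ValueError("need exactly one loss value per sample")
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must lie in [0, 1]")
    n_keep = int(len(dataset) * keep_fraction)
    # Indices sorted from lowest to highest loss; the first n_keep are trusted.
    order = sorted(range(len(dataset)), key=lambda i: losses[i])
    trusted = set(order[:n_keep])
    return [(x, y if i in trusted else None) for i, (x, y) in enumerate(dataset)]


# Toy data: four samples with made-up per-sample losses from a stage-2 model.
data = [("a", 0), ("b", 1), ("c", 0), ("d", 1)]
stage2_losses = [0.1, 2.0, 0.3, 1.5]

unlabeled_inputs = strip_labels(data)           # what stage 1 would consume
stage3_data = split_by_loss(data, stage2_losses, keep_fraction=0.5)
print(unlabeled_inputs)  # ['a', 'b', 'c', 'd']
print(stage3_data)       # [('a', 0), ('b', None), ('c', 0), ('d', None)]
```

The point of the sketch is only that labels enter the pipeline after the backbone is learned, and that distrusted labels are dropped rather than the samples themselves.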
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection
