CSC: Turning the Adversary's Poison against Itself
Yuchen Shi, Xin Guo, Huajie Chen, Tianqing Zhu, Bo Liu, Wanlei Zhou

TL;DR
This paper introduces CSC, a defense against poisoning-based backdoor attacks. It detects poisoned samples by clustering on early training dynamics and then suppresses them, reducing attack success rates to near zero.
Contribution
The paper presents a new poison suppression method, Cluster Segregation Concealment (CSC), which identifies poisoned samples early and relabels them to maintain model accuracy while eliminating backdoors.
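To make the detection idea concrete, here is a minimal sketch of flagging an isolated minority cluster among the latent features of samples that share one label. It is an illustration under stated assumptions, not the authors' implementation: the summary does not say which clustering algorithm CSC uses, so plain 2-means stands in for it, and the names `two_means`, `flag_isolated_minority`, `max_fraction`, and `min_separation` are hypothetical. The features are synthetic stand-ins for early-epoch latent representations.

```python
import numpy as np


def two_means(features, n_iter=50):
    """Plain 2-means clustering. Returns (cluster id per row, the two centers)."""
    # Deterministic init: the first point and the point farthest from it.
    first = features[0]
    farthest = features[np.argmax(np.linalg.norm(features - first, axis=1))]
    centers = np.stack([first, farthest])
    for _ in range(n_iter):
        dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        assign = np.argmin(dists, axis=1)
        updated = np.stack([
            features[assign == k].mean(axis=0) if np.any(assign == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    # Final assignment against the final centers.
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return np.argmin(dists, axis=1), centers


def flag_isolated_minority(features, max_fraction=0.5, min_separation=3.0):
    """Boolean mask of samples in a small cluster that sits far from the rest.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    assign, centers = two_means(features)
    sizes = np.bincount(assign, minlength=2)
    minority = int(np.argmin(sizes))
    no_flags = np.zeros(len(features), dtype=bool)
    if sizes[minority] == 0 or sizes[minority] / len(features) >= max_fraction:
        return no_flags
    # Require the gap between centers to dominate the within-cluster spread.
    spread = np.linalg.norm(features - centers[assign], axis=1).mean()
    gap = np.linalg.norm(centers[0] - centers[1])
    if gap < min_separation * max(spread, 1e-12):
        return no_flags
    return assign == minority


# Synthetic stand-in for one class: 90 benign features plus 10 features
# pushed far away, mimicking a trigger that acts as a dominant feature.
rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, size=(90, 8))
poisoned = rng.normal(8.0, 0.3, size=(10, 8))
features = np.vstack([benign, poisoned])

flags = flag_isolated_minority(features)
print("flagged samples:", int(flags.sum()))
```

The sketch stops at flagging. The summary says CSC relabels the samples it identifies, but it does not state the relabeling target, so that step is left out here.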
Findings
CSC reduces attack success rates to near zero across multiple datasets.
CSC outperforms nine state-of-the-art defenses in empirical evaluations.
CSC incurs minimal accuracy loss on clean data, preserving model utility.
Abstract
Poisoning-based backdoor attacks pose significant threats to deep neural networks by embedding triggers in training data, causing models to misclassify triggered inputs as adversary-specified labels while maintaining performance on clean data. Existing poison restraint-based defenses often suffer from inadequate detection against specific attack variants and compromise model utility through unlearning methods that lead to accuracy degradation. This paper conducts a comprehensive analysis of backdoor attack dynamics during model training, revealing that poisoned samples form isolated clusters in latent space early on, with triggers acting as dominant features distinct from benign ones. Leveraging these insights, we propose Cluster Segregation Concealment (CSC), a novel poison suppression defense. CSC first trains a deep neural network via standard supervised learning while segregating…
