Circumventing Backdoor Space via Weight Symmetry
Jie Peng, Hongwei Yang, Jing Zhao, Hengji Dong, Hui He, Weizhe Zhang, Haoyu He

TL;DR
This paper introduces TSC, a backdoor defense method that leverages weight symmetry and permutation invariance, effective across supervised and self-supervised learning without requiring labeled data.
Contribution
The paper proposes TSC, a novel backdoor purification technique based on weight symmetry and quadratic mode connectivity, applicable across various learning paradigms.
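To make "quadratic mode connectivity" concrete, here is a minimal sketch assuming the quadratic Bezier parameterization that is common in the mode-connectivity literature. The summary does not state the exact curve TSC uses, so the function `quadratic_bezier`, the control point `theta_mid`, and the toy weight vectors are all hypothetical illustrations. In practice the control point would be trained so that loss stays low along the path; here it is fixed by hand.

```python
import numpy as np

def quadratic_bezier(theta_a, theta_b, theta_mid, t):
    """Point at position t in [0, 1] on a quadratic Bezier curve in weight space.

    theta_a and theta_b are the flattened weights of the two endpoint models;
    theta_mid is the control point (assumed here, normally learned).
    """
    return (1 - t) ** 2 * theta_a + 2 * t * (1 - t) * theta_mid + t ** 2 * theta_b

# Toy flattened weight vectors (illustrative values only).
theta_a = np.array([1.0, 0.0, -2.0])
theta_b = np.array([-1.0, 2.0, 0.0])
theta_mid = np.array([0.5, 3.0, 1.0])

# Sample five models along the curve, including both endpoints.
path = [quadratic_bezier(theta_a, theta_b, theta_mid, t) for t in np.linspace(0, 1, 5)]
```

The curve starts at `theta_a` and ends at `theta_b`, and every intermediate point is a full set of model weights that can be evaluated like any other model.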
Findings
TSC effectively purifies backdoored models with minimal clean data.
TSC maintains high accuracy on clean samples while removing backdoors.
TSC generalizes well to self-supervised learning frameworks.
Abstract
Deep neural networks are vulnerable to backdoor attacks, where malicious behaviors are implanted during training. While existing defenses can effectively purify compromised models, they typically require labeled data or specific training procedures, making them difficult to apply beyond supervised learning settings. Notably, recent studies have shown successful backdoor attacks across various learning paradigms, highlighting a critical security concern. To address this gap, we propose Two-stage Symmetry Connectivity (TSC), a novel backdoor purification defense that operates independently of data format and requires only a small fraction of clean samples. Through theoretical analysis, we prove that by leveraging permutation invariance in neural networks and quadratic mode connectivity, TSC amplifies the loss on poisoned samples while maintaining bounded clean accuracy. Experiments…
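The permutation invariance mentioned in the abstract can be illustrated with a small sketch: reordering the hidden units of a network, together with the matching rows of the next layer, changes the weights but not the function the network computes. The two-layer ReLU MLP below and the helper names `mlp_forward` and `permute_hidden_units` are assumptions chosen for illustration, not the architecture or code from the paper.

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """Two-layer MLP with a ReLU hidden layer."""
    h = np.maximum(x @ W1 + b1, 0.0)
    return h @ W2 + b2

def permute_hidden_units(W1, b1, W2, perm):
    """Reorder hidden units: columns of W1, entries of b1, rows of W2."""
    return W1[:, perm], b1[perm], W2[perm, :]

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 3)), rng.normal(size=3)
x = rng.normal(size=(5, 4))

# Reverse the order of the 8 hidden units (any permutation would work).
perm = np.arange(8)[::-1]
W1p, b1p, W2p = permute_hidden_units(W1, b1, W2, perm)

out_original = mlp_forward(x, W1, b1, W2, b2)
out_permuted = mlp_forward(x, W1p, b1p, W2p, b2)
```

The two outputs agree up to floating-point error even though `W1p` differs from `W1`, so the same function corresponds to many distinct points in weight space.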
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Domain Adaptation and Few-Shot Learning
Methods: Average Pooling · Kaiming Initialization · Global Average Pooling · Max Pooling · Convolution · Color Jitter · Random Resized Crop · Normalized Temperature-scaled Cross Entropy Loss · Dense Connections
