Training set cleansing of backdoor poisoning by self-supervised representation learning
H. Wang, S. Karami, O. Dia, H. Ritter, E. Emamjomeh-Zadeh, J. Chen, Z. Xiang, D.J. Miller, G. Kesidis

TL;DR
This paper proposes a novel data cleansing method using self-supervised representation learning to effectively mitigate backdoor poisoning attacks in image classification, outperforming existing techniques.
Contribution
The paper introduces a backdoor defense based on self-supervised learning that combines sample filtering with re-labeling, and reports state-of-the-art results.
Findings
Effective backdoor mitigation on CIFAR-10
Outperforms existing defense methods
Utilizes self-supervised embeddings to identify poisoned samples
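The summary above does not say how embeddings are turned into filtering and re-labeling decisions. The following is a minimal sketch of one plausible way to do it, not the authors' algorithm: a k-nearest-neighbor label-consistency check in embedding space. The function name `knn_cleanse`, the parameters `k` and `relabel_threshold`, and the keep/relabel/drop rule are all hypothetical, and the embeddings are assumed to come from some self-supervised encoder trained without labels.

```python
import math
from collections import Counter

def knn_cleanse(embeddings, labels, k=3, relabel_threshold=0.8):
    """Hypothetical cleansing rule (an assumption, not the paper's method).

    For each sample, look at the labels of its k nearest neighbors in
    embedding space:
      - neighbors' majority label matches the sample's label -> keep
      - majority disagrees and is strong (>= relabel_threshold) -> relabel
      - majority disagrees but is weak -> drop (filter out)
    """
    decisions = []
    for i, e in enumerate(embeddings):
        # Euclidean distance to every other sample, nearest first.
        dists = sorted(
            (math.dist(e, other), j)
            for j, other in enumerate(embeddings)
            if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        majority, count = Counter(neighbor_labels).most_common(1)[0]
        if majority == labels[i]:
            decisions.append(("keep", labels[i]))
        elif count / len(neighbor_labels) >= relabel_threshold:
            decisions.append(("relabel", majority))
        else:
            decisions.append(("drop", None))
    return decisions
```

The intuition behind the sketch is that an encoder trained without labels cannot learn the trigger-to-target-class shortcut, so a poisoned image may embed near clean images of its true class, where its attacker-assigned label disagrees with its neighbors.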
Abstract
A backdoor or Trojan attack is an important type of data poisoning attack against deep neural network (DNN) classifiers, wherein the training dataset is poisoned with a small number of samples that each possess the backdoor pattern (usually a pattern that is either imperceptible or innocuous) and that are mislabeled to the attacker's target class. When trained on a backdoor-poisoned dataset, a DNN behaves normally on most benign test samples but misclassifies a test sample to the target class when that sample incorporates the backdoor pattern (i.e., contains a backdoor trigger). Here we focus on image classification tasks and show that supervised training may build a stronger association between the backdoor pattern and the target class than between normal features and the true class of origin. By contrast, self-supervised representation learning ignores the…
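To make the threat model in the abstract concrete, here is a minimal sketch of how a backdoor-poisoned training set could be constructed. It is an illustration under stated assumptions, not the attack evaluated in the paper: the function name `poison_dataset`, the square corner patch, and the parameters `poison_rate`, `patch_size`, and `patch_value` are hypothetical, and images are plain 2-D lists of grayscale values in [0, 1]. A visible patch is used only for simplicity; the abstract notes that real patterns are usually imperceptible or innocuous.

```python
import random

def poison_dataset(images, labels, target_class, poison_rate,
                   patch_size=2, patch_value=1.0, seed=0):
    """Return (poisoned_images, poisoned_labels, poisoned_indices).

    A fraction `poison_rate` of the dataset, drawn from non-target
    classes, gets a small patch stamped in the bottom-right corner
    (the assumed backdoor pattern) and is mislabeled to `target_class`.
    The input lists are not modified.
    """
    rng = random.Random(seed)
    # Only samples whose true class differs from the target are poisoned.
    candidates = [i for i, y in enumerate(labels) if y != target_class]
    n_poison = min(int(round(poison_rate * len(images))), len(candidates))
    chosen = set(rng.sample(candidates, n_poison))

    new_images, new_labels = [], []
    for i, (img, y) in enumerate(zip(images, labels)):
        img = [row[:] for row in img]  # copy so the original stays clean
        if i in chosen:
            h, w = len(img), len(img[0])
            for r in range(h - patch_size, h):
                for c in range(w - patch_size, w):
                    img[r][c] = patch_value  # stamp the trigger
            y = target_class  # mislabel to the attacker's target
        new_images.append(img)
        new_labels.append(y)
    return new_images, new_labels, sorted(chosen)
```

For example, calling `poison_dataset(images, labels, target_class=0, poison_rate=0.1)` on 20 images poisons 2 of them; a classifier trained on the result is the kind of model the abstract describes as behaving normally on benign inputs while responding to the trigger.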
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
