ESTAS: Effective and Stable Trojan Attacks in Self-supervised Encoders with One Target Unlabelled Sample
Jiaqi Xue, Qian Lou

TL;DR
This paper introduces ESTAS, a novel Trojan attack method on self-supervised learning encoders that achieves high success rates using only one unlabeled target sample, highlighting security vulnerabilities in SSL models.
Contribution
ESTAS is the first attack to effectively and stably compromise SSL encoders with a single unlabeled target sample, using innovative trigger poisoning and cascade optimization techniques.
Findings
Achieves over 99% attack success rate with one target sample
Outperforms prior methods with >30% higher success rate
Improves model accuracy by over 8.3% on average relative to prior attacks
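For readers unfamiliar with the metric behind these findings, the sketch below shows how an attack success rate (ASR) is commonly computed: the fraction of trigger-stamped test inputs that a downstream classifier assigns to the attacker's target class. It is a minimal illustration of the usual definition, not the authors' evaluation code. The function name `attack_success_rate` and the sample predictions are hypothetical, and the sketch assumes the inputs have already been filtered to exclude samples whose true class is the target.

```python
from typing import Sequence


def attack_success_rate(predictions: Sequence[int], target_class: int) -> float:
    """Return the fraction of trigger-stamped inputs predicted as target_class.

    Assumption: `predictions` holds the downstream classifier's labels for
    inputs that carry the trigger and whose true class is not the target.
    """
    if len(predictions) == 0:
        raise ValueError("predictions must be non-empty")
    hits = sum(1 for p in predictions if p == target_class)
    return hits / len(predictions)


# Hypothetical predictions for five triggered inputs, with target class 3.
preds = [3, 3, 3, 1, 3]
print(f"ASR = {attack_success_rate(preds, target_class=3):.0%}")
```

Under this definition, an ASR above 99% means that fewer than one in a hundred triggered inputs escapes misclassification into the target class.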
Abstract
Emerging self-supervised learning (SSL) has become a popular image representation encoding method that obviates the reliance on labeled data and learns rich representations from large-scale, ubiquitous unlabeled data. One can then train a downstream classifier on top of the pre-trained SSL image encoder with little or no labeled downstream data. Although extensive work shows that SSL achieves remarkable and competitive performance on different downstream tasks, its security concerns, e.g., Trojan attacks on SSL encoders, are still not well studied. In this work, we present a novel Trojan attack method, denoted ESTAS, that enables an effective and stable attack on SSL encoders with only one unlabeled target sample. In particular, we propose consistent trigger poisoning and cascade optimization in ESTAS to improve attack efficacy and model accuracy, and eliminate the expensive…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Electrostatic Discharge in Electronics
