Pre-trained Encoders in Self-Supervised Learning Improve Secure and Privacy-preserving Supervised Learning
Hongbin Liu, Wenjie Qu, Jinyuan Jia, Neil Zhenqiang Gong

TL;DR
This paper systematically measures whether pre-trained encoders from self-supervised learning can address the limitations of secure and privacy-preserving supervised learning algorithms, such as accuracy loss and small certified security guarantees.
Contribution
The paper presents the first systematic measurement study showing that pre-trained encoders improve both the accuracy and the certified security guarantees of secure and privacy-preserving supervised learning algorithms.
Findings
Pre-trained encoders improve classifier accuracy when no attack is present.
They enhance certified security against data poisoning and backdoor attacks.
They boost the effectiveness of differentially private classifiers.
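The finding on differentially private classifiers can be made concrete with a minimal sketch. The summary does not say which differentially private training algorithm was evaluated, so the code below assumes a DP-SGD-style update (per-example gradient clipping plus Gaussian noise) for a logistic-regression classifier trained on encoder features. The names `dp_gradient`, `clip_norm`, and `noise_multiplier` are illustrative, the features are random placeholders, and no privacy budget is computed here; that would require a separate privacy accountant.

```python
import numpy as np

def dp_gradient(feats, labels, w, clip_norm, noise_multiplier, rng):
    """One clipped, noised gradient for binary logistic regression (assumed DP-SGD style)."""
    probs = 1.0 / (1.0 + np.exp(-(feats @ w)))
    # Per-example gradients of the logistic loss, shape (n, d).
    per_example = (probs - labels)[:, None] * feats
    # Rescale each example's gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example * scale
    # Gaussian noise calibrated to the clipping bound.
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=w.shape)
    return (clipped.sum(axis=0) + noise) / len(feats)

rng = np.random.default_rng(0)
feats = rng.normal(size=(32, 8))               # placeholder for encoder features
labels = rng.integers(0, 2, size=32).astype(float)
w = np.zeros(8)

# A few noisy gradient steps on the linear classifier.
for _ in range(10):
    w -= 0.1 * dp_gradient(feats, labels, w, clip_norm=1.0,
                           noise_multiplier=1.0, rng=rng)
```

Because the classifier on top of a frozen encoder is small, the noise is spread over few parameters. This is one plausible reason such classifiers could tolerate the noise better, though the summary itself does not state the mechanism.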
Abstract
Classifiers in supervised learning have various security and privacy issues, e.g., 1) data poisoning attacks, backdoor attacks, and adversarial examples on the security side as well as 2) inference attacks and the right to be forgotten for the training data on the privacy side. Various secure and privacy-preserving supervised learning algorithms with formal guarantees have been proposed to address these issues. However, they suffer from various limitations such as accuracy loss, small certified security guarantees, and/or inefficiency. Self-supervised learning is an emerging technique to pre-train encoders using unlabeled data. Given a pre-trained encoder as a feature extractor, supervised learning can train a simple yet accurate classifier using a small amount of labeled training data. In this work, we perform the first systematic, principled measurement study to understand whether and…
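The abstract's setup, a pre-trained encoder used as a fixed feature extractor with a simple classifier trained on a small labeled set, can be sketched as follows. This is an illustration under stated assumptions, not the paper's pipeline: `frozen_encoder` is a hypothetical stand-in (a fixed random projection with ReLU) for a real pre-trained encoder, the data is synthetic, and the classifier is plain softmax regression trained by gradient descent.

```python
import numpy as np

def frozen_encoder(x, W):
    """Hypothetical stand-in for a pre-trained encoder; W is never updated."""
    return np.maximum(x @ W, 0.0)

def train_linear_probe(feats, labels, num_classes, lr=0.05, epochs=300):
    """Softmax regression on frozen features using full-batch gradient descent."""
    n, d = feats.shape
    V = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = feats @ V + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # gradient w.r.t. logits
        V -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return V, b

def predict(feats, V, b):
    return np.argmax(feats @ V + b, axis=1)

rng = np.random.default_rng(0)
W = rng.normal(size=(20, 64)) / np.sqrt(20)          # fixed "encoder" weights

# Synthetic two-class data: 50 labeled training points, 200 test points.
means = np.stack([np.full(20, -1.0), np.full(20, 1.0)])
y_train = np.array([0, 1] * 25)
x_train = means[y_train] + rng.normal(size=(50, 20))
y_test = rng.integers(0, 2, size=200)
x_test = means[y_test] + rng.normal(size=(200, 20))

V, b = train_linear_probe(frozen_encoder(x_train, W), y_train, num_classes=2)
test_acc = np.mean(predict(frozen_encoder(x_test, W), V, b) == y_test)
print(f"test accuracy: {test_acc:.3f}")
```

Only `V` and `b` are learned; the encoder weights stay fixed, which is what keeps the supervised stage small enough to train on few labels.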
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data
Methods: Randomized Smoothing
