Pre-trained Encoders in Self-Supervised Learning Improve Secure and Privacy-preserving Supervised Learning

Hongbin Liu; Wenjie Qu; Jinyuan Jia; Neil Zhenqiang Gong

arXiv:2212.03334·cs.CR·December 8, 2022

Open Access

TL;DR

This paper systematically measures how pre-trained encoders from self-supervised learning improve secure and privacy-preserving supervised learning, whose existing algorithms suffer from accuracy loss, small certified security guarantees, and/or inefficiency.

Contribution

It presents the first systematic measurement study showing that pre-trained encoders improve both the accuracy and the certified security guarantees of secure and privacy-preserving supervised learning.

Findings

01  Pre-trained encoders improve classifier accuracy when no attack is present.

02  Pre-trained encoders strengthen certified security guarantees against data poisoning and backdoor attacks.

03  Pre-trained encoders make differentially private classifiers more effective.

Abstract

Classifiers in supervised learning have various security and privacy issues, e.g., 1) data poisoning attacks, backdoor attacks, and adversarial examples on the security side as well as 2) inference attacks and the right to be forgotten for the training data on the privacy side. Various secure and privacy-preserving supervised learning algorithms with formal guarantees have been proposed to address these issues. However, they suffer from various limitations such as accuracy loss, small certified security guarantees, and/or inefficiency. Self-supervised learning is an emerging technique to pre-train encoders using unlabeled data. Given a pre-trained encoder as a feature extractor, supervised learning can train a simple yet accurate classifier using a small amount of labeled training data. In this work, we perform the first systematic, principled measurement study to understand whether and…
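The abstract's description of using a pre-trained encoder as a feature extractor, with a simple classifier trained on a small labeled set, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the "encoder" is a hypothetical stand-in (a fixed random projection followed by tanh), the data is a synthetic two-class toy set, and the classifier is plain logistic regression trained by gradient descent. The only point it shows is the workflow: the encoder stays frozen and only the small classifier on top of its features is trained.

```python
import math
import random


def make_frozen_encoder(in_dim, feat_dim, seed=0):
    """Hypothetical stand-in for a pre-trained encoder.

    A fixed random projection followed by tanh; the weights are created
    once and never updated, mimicking a frozen feature extractor.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(feat_dim)]

    def encode(x):
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in weights]

    return encode


def make_toy_data(n_per_class=20, in_dim=4, noise=0.2, seed=1):
    """Small synthetic labeled set: two clusters centered at -1 and +1."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            xs.append([center + rng.gauss(0.0, noise) for _ in range(in_dim)])
            ys.append(label)
    return xs, ys


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_classifier(features, labels, epochs=200, lr=0.5):
    """Binary logistic regression on frozen features (full-batch gradient descent)."""
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for f, y in zip(features, labels):
            err = sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + b) - y
            for i in range(dim):
                grad_w[i] += err * f[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(encode, w, b, x):
    """Encode the input with the frozen encoder, then apply the linear classifier."""
    f = encode(x)
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0


def run_demo():
    """Train only the classifier; return accuracy on the toy training set."""
    encode = make_frozen_encoder(in_dim=4, feat_dim=8)
    xs, ys = make_toy_data()
    feats = [encode(x) for x in xs]  # encoder is used for inference only
    w, b = train_linear_classifier(feats, ys)
    correct = sum(predict(encode, w, b, x) == y for x, y in zip(xs, ys))
    return correct / len(xs)


if __name__ == "__main__":
    print("toy training accuracy:", run_demo())
```

The accuracy printed here describes only this toy setup and says nothing about the paper's measurements. In practice the stand-in encoder would be replaced by an actual encoder pre-trained on unlabeled data.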

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data

Methods: Randomized Smoothing