Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

Nicolas Papernot; Martín Abadi; Úlfar Erlingsson; Ian Goodfellow; Kunal Talwar

arXiv:1610.05755 · stat.ML · March 6, 2017 · 186 cites


TL;DR

This paper introduces PATE, a privacy-preserving semi-supervised learning framework that uses multiple teacher models trained on sensitive data to guide a student model, ensuring strong privacy guarantees while maintaining high utility.

Contribution

The paper presents PATE, a novel semi-supervised learning approach that provides differential privacy guarantees using an ensemble of teacher models trained on disjoint datasets.

Findings

01. Achieves state-of-the-art privacy/utility trade-offs on the MNIST and SVHN datasets.

02. Applies to any model type, including deep neural networks.

03. Provides formal differential privacy guarantees that hold even when an adversary can inspect the student's internal workings, not only query it.

Abstract

Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. To address this problem, we demonstrate a generally applicable approach to providing strong privacy guarantees for training data: Private Aggregation of Teacher Ensembles (PATE). The approach combines, in a black-box fashion, multiple models trained with disjoint datasets, such as records from different subsets of users. Because they rely directly on sensitive data, these models are not published, but instead used as "teachers" for a "student" model. The student learns to predict an output chosen by noisy voting among all of the teachers, and cannot directly access an individual teacher or the…
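
The noisy voting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `laplace_noise` and `noisy_vote`, the `noise_scale` parameter, and the choice of Laplace noise are assumptions made here, since the excerpt does not specify the noise distribution or its calibration.

```python
import random
from collections import Counter
from typing import Optional, Sequence


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample using only the standard library."""
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centred at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_vote(
    teacher_labels: Sequence[int],
    num_classes: int,
    noise_scale: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Return the class with the highest noisy vote count.

    teacher_labels: one predicted class per teacher for a single query.
    noise_scale:    hypothetical knob; larger values mean more noise
                    (more privacy, less accurate labels). 0 disables noise.
    """
    rng = rng or random.Random()
    counts = Counter(teacher_labels)
    noisy_counts = []
    for c in range(num_classes):
        noise = laplace_noise(noise_scale, rng) if noise_scale > 0 else 0.0
        noisy_counts.append(counts.get(c, 0) + noise)
    # Only this single winning label is released to the student;
    # the per-teacher votes and the raw counts stay private.
    return max(range(num_classes), key=lambda c: noisy_counts[c])


# Example: 250 hypothetical teachers, most of which agree on class 7.
votes = [7] * 240 + [3] * 10
label = noisy_vote(votes, num_classes=10, noise_scale=1.0, rng=random.Random(0))
```

When the teachers agree strongly, as in the example, the noise almost never changes the winning label. When the vote is close, the noise can flip the outcome, which is what limits how much any single teacher, and therefore any single disjoint dataset, can influence what the student sees.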


Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning