Learning Privately from Multiparty Data
Jihun Hamm, Paul Cao, Mikhail Belkin

TL;DR
This paper introduces a method to build accurate, differentially private classifiers from multi-party data without accessing private data directly, using ensemble knowledge transfer and risk weighting to ensure privacy and performance.
Contribution
It proposes a novel approach to create a global differentially private classifier by transferring knowledge from local classifiers, with a new risk weighting scheme to improve accuracy.
Findings
Relative to a non-private solution, the private classifier's generalization error is bounded by $O(\frac{1}{\epsilon^2 M^2})$, where $\epsilon$ is the privacy parameter and $M$ is the number of parties.
The method achieves strong privacy guarantees with minimal performance loss in large multi-party settings.
Experimental results on activity recognition, intrusion detection, and URL classification validate the approach.
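To make the scaling in the first finding concrete, the short sketch below evaluates the rate $1/(\epsilon^2 M^2)$ for a few settings. The function name `bound_scale` is hypothetical, and the constant hidden by the big-O notation is set to 1 purely for illustration, so only the ratios between values are meaningful, not the values themselves.

```python
def bound_scale(epsilon: float, num_parties: int) -> float:
    """Return 1 / (epsilon^2 * M^2), i.e. the rate with the unknown constant set to 1."""
    return 1.0 / (epsilon ** 2 * num_parties ** 2)

# Reference setting (values chosen arbitrarily for illustration).
base = bound_scale(epsilon=0.1, num_parties=100)

# Doubling the number of parties shrinks the term by a factor of 4.
doubled_parties = bound_scale(epsilon=0.1, num_parties=200)

# Halving epsilon (stronger privacy) is offset by doubling the number of parties.
stricter_privacy = bound_scale(epsilon=0.05, num_parties=200)

print(base / doubled_parties)
print(base / stricter_privacy)
```

Under this rate, a stronger privacy requirement can be compensated by a proportionally larger number of parties, which is consistent with the claim about large multi-party settings.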
Abstract
Learning a classifier from private data collected by multiple parties is an important problem that has many potential applications. How can we build an accurate and differentially private global classifier by combining locally-trained classifiers from different parties, without access to any party's private data? We propose to transfer the 'knowledge' of the local classifier ensemble by first creating labeled data from auxiliary unlabeled data, and then training a global $\epsilon$-differentially private classifier. We show that majority voting is too sensitive and therefore propose a new risk weighted by class probabilities estimated from the ensemble. Relative to a non-private solution, our private solution has a generalization error bounded by $O(\frac{1}{\epsilon^2 M^2})$, where $M$ is the number of parties. This allows strong privacy without performance loss when $M$ is large, such as in…
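The contrast the abstract draws between majority voting and a risk weighted by class probabilities can be illustrated with a small sketch. It is an assumption-laden reading of the abstract, not the authors' implementation: it assumes binary labels in {-1, +1}, estimates the class probability as the fraction of parties voting +1, and uses a logistic loss. All function names are hypothetical, and the differential privacy mechanism itself is omitted.

```python
import math

def vote_fractions(votes):
    """votes[m][i] is party m's label in {-1, +1} for auxiliary sample i.
    Returns, per sample, the fraction of parties voting +1."""
    num_parties = len(votes)
    num_samples = len(votes[0])
    return [
        sum(1 for m in range(num_parties) if votes[m][i] == 1) / num_parties
        for i in range(num_samples)
    ]

def majority_labels(alpha):
    """Hard labels from vote fractions: +1 if at least half the parties vote +1."""
    return [1 if a >= 0.5 else -1 for a in alpha]

def logistic_loss(margin):
    """log(1 + exp(-margin)), computed in a numerically stable way."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def weighted_risk(w, X, alpha):
    """Average loss where each sample counts as +1 with weight alpha
    and as -1 with weight (1 - alpha). Assumed form, for illustration only."""
    total = 0.0
    for x, a in zip(X, alpha):
        score = sum(wj * xj for wj, xj in zip(w, x))
        total += a * logistic_loss(score) + (1 - a) * logistic_loss(-score)
    return total / len(X)

# Five parties vote on one auxiliary sample; then one party changes its vote.
before = vote_fractions([[1], [1], [1], [-1], [-1]])
after = vote_fractions([[1], [1], [-1], [-1], [-1]])

print(majority_labels(before), majority_labels(after))  # hard label flips
print(before, after)  # soft weight moves by only 1/M
```

In this toy case a single party's change flips the majority label outright, while the vote fraction shifts by only 1/M. That is one way to see why a weighted risk can be less sensitive to any single party than majority voting.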
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Mobile Crowdsensing and Crowdsourcing · Internet Traffic Analysis and Secure E-voting
