The Value of Collaboration in Convex Machine Learning with Differential Privacy

Nan Wu; Farhad Farokhi; David Smith; Mohamed Ali Kaafar

arXiv:1906.09679·cs.CR·July 3, 2019·6 cites

Open Access

TL;DR

This paper analyzes how collaboration among multiple privacy-aware data owners affects the utility of convex machine learning models trained with differential privacy, providing predictive insights into privacy-utility trade-offs.

Contribution

It introduces a model to predict the impact of privacy parameters and dataset size on the quality of differentially-private machine learning models, validated on real financial and fraud detection datasets.

Findings

1. The model predicts the privacy-utility trade-off from the dataset size and the privacy budget.

2. Validation on real financial and fraud detection datasets confirms the accuracy of the performance prediction.

3. The benefit of collaboration increases with larger datasets and higher privacy budgets.

Abstract

In this paper, we apply machine learning to distributed private data owned by multiple data owners, entities with access to non-overlapping training datasets. We use noisy, differentially-private gradients to minimize the fitness cost of the machine learning model using stochastic gradient descent. We quantify the quality of the trained model, using the fitness cost, as a function of privacy budget and size of the distributed datasets to capture the trade-off between privacy and utility in machine learning. This way, we can predict the outcome of collaboration among privacy-aware data owners prior to executing potentially computationally-expensive machine learning algorithms. Particularly, we show that the difference between the fitness of the trained machine learning model using differentially-private gradient queries and the fitness of the trained machine learning model in the absence of any…
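The setup described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes a linear least-squares model, per-record gradient clipping in the L1 norm, the Laplace mechanism, and an even split of each owner's privacy budget across iterations (basic composition). All function and parameter names (`private_gradient`, `train`, `fitness`, `bound`, `epsilons`) are hypothetical and chosen only for illustration.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clip(v, bound):
    # Rescale v so that its L1 norm is at most `bound`.
    norm = sum(abs(c) for c in v)
    factor = min(1.0, bound / norm) if norm > 0 else 1.0
    return [c * factor for c in v]


def private_gradient(theta, data, eps_query, bound, rng):
    # One owner's answer to a gradient query: the average of clipped
    # per-record squared-loss gradients, plus Laplace noise.
    total = [0.0] * len(theta)
    for x, y in data:
        residual = sum(t * xi for t, xi in zip(theta, x)) - y
        g = clip([residual * xi for xi in x], bound)
        total = [a + b for a, b in zip(total, g)]
    n = len(data)
    avg = [s / n for s in total]
    if eps_query is None:  # assumption: None means "no privacy constraint"
        return avg
    # Replacing one record moves the average by at most 2*bound/n in L1 norm.
    scale = 2.0 * bound / (n * eps_query)
    return [a + laplace(scale, rng) for a in avg]


def train(owners, epsilons, steps, lr, bound, rng):
    # Gradient descent in which the learner combines the owners' noisy
    # answers, weighting each by its share of the total data.
    dim = len(owners[0][0][0])
    n_total = sum(len(d) for d in owners)
    theta = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for data, eps in zip(owners, epsilons):
            # Assumption: each owner splits its budget evenly over all steps.
            eps_query = None if eps is None else eps / steps
            g = private_gradient(theta, data, eps_query, bound, rng)
            weight = len(data) / n_total
            grad = [a + weight * b for a, b in zip(grad, g)]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


def fitness(theta, owners):
    # Fitness cost: mean squared-error loss over all owners' records.
    records = [r for data in owners for r in data]
    return sum(
        0.5 * (sum(t * xi for t, xi in zip(theta, x)) - y) ** 2
        for x, y in records
    ) / len(records)
```

Calling `train` once with every entry of `epsilons` set to `None` and once with finite budgets, then comparing the two `fitness` values, gives an empirical version of the gap that the abstract describes. In this sketch the Laplace scale is proportional to `steps / (n * eps)`, so larger datasets and larger budgets both reduce the injected noise.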

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Cryptography and Data Security