Differentially Private Algorithms for Empirical Machine Learning
Ben Stoddard, Yan Chen, Ashwin Machanavajjhala

TL;DR
This paper introduces practical differentially private algorithms for feature selection and ROC curve construction, treating the private classifier training algorithm as a black box. It reports improved accuracy and enables private evaluation on real-world datasets.
Contribution
It presents novel private algorithms for feature selection and ROC curve construction that enhance the practicality of differentially private machine learning workflows.
Findings
Significant accuracy improvements on three real-world datasets.
First private algorithms for ROC curve construction.
Effective feature selection under differential privacy.
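To make the ROC finding concrete, here is a minimal sketch of a naive baseline for a private ROC curve. It is not the authors' algorithm, which this page does not describe. It assumes the thresholds are fixed in advance without looking at the data, and that neighboring datasets differ by adding or removing one record. The function names and parameters are hypothetical. Each record changes at most one count per cut point, so Laplace noise with scale (number of cut points) / epsilon covers all released counts.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_roc_points(scores, labels, thresholds, epsilon, rng):
    """Return noisy (FPR, TPR) pairs, one per cut point (naive baseline).

    Assumption: `thresholds` is chosen independently of the private data.
    """
    # Append -inf so the last cut counts every positive and every negative.
    cuts = sorted(thresholds, reverse=True) + [float("-inf")]
    # One record has a single label, so it shifts at most len(cuts) counts by 1.
    scale = len(cuts) / epsilon

    def noisy_counts(target):
        return [
            sum(1 for s, y in zip(scores, labels) if y == target and s >= t)
            + laplace_noise(scale, rng)
            for t in cuts
        ]

    tp = noisy_counts(1)
    fp = noisy_counts(0)
    # Guard the denominators against tiny or negative noisy totals.
    pos_total = max(tp[-1], 1.0)
    neg_total = max(fp[-1], 1.0)

    def clamp(v):
        return min(1.0, max(0.0, v))

    return [(clamp(f / neg_total), clamp(t / pos_total)) for t, f in zip(tp, fp)]
```

The noisy points are not guaranteed to be monotone, and this sketch does not repair that. With many thresholds the noise scale grows linearly, which is one reason a naive approach like this can give a poor curve.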
Abstract
An important use of private data is to build machine learning classifiers. While there is a burgeoning literature on differentially private classification algorithms, we find that they are not practical in real applications for two reasons. First, existing differentially private classifiers provide poor accuracy on real-world datasets. Second, there is no known differentially private algorithm for empirically evaluating the private classifier on a private test dataset. In this paper, we develop differentially private algorithms that mirror real-world empirical machine learning workflows. We consider the private classifier training algorithm as a black box. We present private algorithms for selecting features that are input to the classifier. Though adding a preprocessing step takes away some of the privacy budget from the actual classification process (thus potentially making it…
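The budget trade-off mentioned in the abstract can be illustrated with a short sketch. This is not the authors' feature selection algorithm. It is a generic illustration that assumes binary features and labels, a simple agreement-count score, the Laplace mechanism, and add-or-remove-one-record neighbors. All names (`split_budget`, `private_top_k_features`, `select_fraction`) are hypothetical. Under sequential composition, the part of epsilon spent on selection is no longer available to the black-box trainer.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def split_budget(epsilon, select_fraction):
    """Split a total budget between feature selection and classifier training."""
    eps_select = epsilon * select_fraction
    return eps_select, epsilon - eps_select


def private_top_k_features(rows, labels, k, eps_select, rng):
    """Pick k feature indices using noisy agreement counts (illustrative score)."""
    d = len(rows[0])
    # Score of feature j: number of records whose feature j equals the label.
    scores = [
        sum(1 for row, y in zip(rows, labels) if row[j] == y) for j in range(d)
    ]
    # One record changes each of the d counts by at most 1, so the L1
    # sensitivity of the score vector is at most d.
    scale = d / eps_select
    noisy = [s + laplace_noise(scale, rng) for s in scores]
    # Ranking the noisy scores is post-processing and costs no extra budget.
    ranked = sorted(range(d), key=lambda j: noisy[j], reverse=True)
    return sorted(ranked[:k])
```

A caller would pass the selected columns and the remaining budget to whatever private trainer is in use, for example `eps_select, eps_train = split_budget(1.0, 0.25)`. The choice of `select_fraction` is itself an assumption here; the source does not state how the budget is divided.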
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques
