User-Level Private Learning via Correlated Sampling

Badih Ghazi; Ravi Kumar; Pasin Manurangsi

arXiv:2110.11208·cs.LG·December 28, 2021

Open Access

TL;DR

This paper studies learning under user-level differential privacy, where each user holds several samples and privacy is enforced on a user's entire data. It shows that, with enough samples per user, learning is possible with far fewer users than in the usual one-sample-per-user setting.

Contribution

It presents a novel correlated sampling technique that enhances global stability under public randomness, enabling efficient user-level private learning with tight bounds.
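
As a rough illustration of what correlated sampling means in general, the sketch below lets two parties who share a random seed each draw from their own distribution so that their draws coincide with high probability when the distributions are close. It assumes a finite universe indexed 0..n-1 and uses a seeded generator as a stand-in for public randomness. The names `correlated_sample` and `agreement_rate` are hypothetical, and this is a generic rejection-style scheme, not the construction from the paper.

```python
import random


def correlated_sample(probs, seed):
    """Return an index distributed according to `probs`.

    All randomness comes from `seed`, so two parties using the same seed
    read the same stream of (candidate, threshold) pairs.
    Assumes `probs` is a valid probability vector (non-negative, sums to 1).
    """
    rng = random.Random(seed)        # stand-in for shared public randomness
    n = len(probs)
    while True:
        x = rng.randrange(n)         # uniformly random candidate index
        u = rng.random()             # uniform threshold in [0, 1)
        if u < probs[x]:             # accept x with probability probs[x]
            return x


def agreement_rate(p, q, trials=2000):
    """Fraction of shared seeds on which the two parties pick the same index."""
    same = sum(
        correlated_sample(p, s) == correlated_sample(q, s) for s in range(trials)
    )
    return same / trials


if __name__ == "__main__":
    p = [0.50, 0.30, 0.20]
    q = [0.45, 0.35, 0.20]           # total variation distance 0.05 from p
    print("agreement rate:", agreement_rate(p, q))
```

Each party's output follows its own distribution exactly, because a candidate `x` is accepted with probability proportional to `probs[x]`. The two outputs differ only when a shared threshold falls between the two parties' probabilities for the same candidate, which is rare when the distributions are close.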

Findings

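01

Given sufficiently many samples per user, any privately learnable class can be learned with O(log(1/δ)/ε) users under (ε, δ)-DP.

02

Under ε-DP, learning is possible with O_ε(d) users even in the local model, where d is the probabilistic representation dimension.

03

Nearly matching lower bounds on the number of users required are shown in both cases.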

Abstract

Most work on learning with differential privacy (DP) has focused on the setting where each user has a single sample. In this work, we consider the setting where each user holds $m$ samples and the privacy protection is enforced at the level of each user's data. We show that, in this setting, we may learn with far fewer users. Specifically, we show that, as long as each user receives sufficiently many samples, we can learn any privately learnable class via an $(\epsilon, \delta)$-DP algorithm using only $O(\log(1/\delta)/\epsilon)$ users. For $\epsilon$-DP algorithms, we show that we can learn using only $O_{\epsilon}(d)$ users even in the local model, where $d$ is the probabilistic representation dimension. In both cases, we show a nearly-matching lower bound on the number of users required. A crucial component of our results is a generalization of global stability [Bun…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Machine Learning and Algorithms