Sample-Efficient Private Learning of Mixtures of Gaussians

Hassan Ashtiani; Mahbod Majid; Shyam Narayanan

arXiv:2411.02298·cs.LG·November 5, 2024

TL;DR

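This paper gives improved sample complexity bounds for learning mixtures of Gaussians with approximate differential privacy: roughly $kd^2 + k^{1.5} d^{1.75} + k^2 d$ samples suffice in $d$ dimensions, which is provably optimal when $d$ is much larger than $k^2$, and the univariate case gets its first optimal bound, linear in the number of components $k$.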

Contribution

It proves new sample complexity bounds for privately learning Gaussian mixtures, improving on the previous best of roughly $k^2 d^4$ samples [AAL24b], and gives the first optimal bound for univariate mixtures.

Findings

01

Sample complexity for high-dimensional mixtures is roughly $kd^2 + k^{1.5} d^{1.75} + k^2 d$.

02

Achieves linear-in-$k$ sample complexity for univariate Gaussian mixtures.

03

The algorithms combine the inverse sensitivity mechanism, sample compression, and techniques for bounding the volume of sumsets.
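To get a feel for the bound in finding 01, the following minimal sketch evaluates its three terms next to the previous $k^2 d^4$ bound. It is only an illustration: the function names are hypothetical, and the formulas ignore constants, logarithmic factors, and the dependence on the accuracy and privacy parameters, which the summary does not state.

```python
def new_bound(k: int, d: int) -> float:
    """Rough new bound: k d^2 + k^1.5 d^1.75 + k^2 d (constants and logs ignored)."""
    return k * d**2 + k**1.5 * d**1.75 + k**2 * d


def previous_bound(k: int, d: int) -> float:
    """Rough previous best bound [AAL24b]: k^2 d^4 (constants and logs ignored)."""
    return k**2 * d**4


def dominant_term(k: int, d: int) -> str:
    """Return the label of the largest of the three terms in the new bound."""
    terms = {
        "k d^2": k * d**2,
        "k^1.5 d^1.75": k**1.5 * d**1.75,
        "k^2 d": k**2 * d,
    }
    return max(terms, key=terms.get)


if __name__ == "__main__":
    # Illustrative (k, d) pairs chosen by hand, not taken from the paper.
    for k, d in [(2, 100), (10, 10), (100, 2)]:
        ratio = previous_bound(k, d) / new_bound(k, d)
        print(f"k={k}, d={d}: dominant term {dominant_term(k, d)}, "
              f"previous/new ratio about {ratio:.1f}")
```

Under these simplifications, the $kd^2$ term dominates once $d$ is large relative to $k$, while the $k^2 d$ term takes over when $k$ is large relative to $d$.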

Abstract

We study the problem of learning mixtures of Gaussians with approximate differential privacy. We prove that roughly $k d^{2} + k^{1.5} d^{1.75} + k^{2} d$ samples suffice to learn a mixture of $k$ arbitrary $d$-dimensional Gaussians up to low total variation distance, with differential privacy. Our work improves over the previous best result [AAL24b] (which required roughly $k^{2} d^{4}$ samples) and is provably optimal when $d$ is much larger than $k^{2}$. Moreover, we give the first optimal bound for privately learning mixtures of $k$ univariate (i.e., $1$-dimensional) Gaussians. Importantly, we show that the sample complexity for privately learning mixtures of univariate Gaussians is linear in the number of components $k$, whereas the previous best sample complexity [AAL21] was quadratic in $k$. Our algorithms utilize various techniques, including the inverse sensitivity mechanism [AD20b,…
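A short calculation suggests why the regime $d \gg k^2$ is singled out. The sketch below compares the three terms of the bound directly, ignoring constants, logarithmic factors, and the accuracy and privacy parameters (none of which appear in the abstract), and assuming $k, d \ge 1$.

```latex
\begin{align*}
k^{1.5} d^{1.75} \le k d^{2}
  &\iff k^{0.5} \le d^{0.25}
  \iff k^{2} \le d, \\
k^{2} d \le k d^{2}
  &\iff k \le d.
\end{align*}
% Since k >= 1 implies k <= k^2, the condition d >= k^2 gives both inequalities, so
\[
k d^{2} \;\le\; k d^{2} + k^{1.5} d^{1.75} + k^{2} d \;\le\; 3\, k d^{2}
\qquad \text{whenever } d \ge k^{2}.
\]
```

So once $d \ge k^2$ the whole bound is within a constant factor of its first term, $k d^2$, which is consistent with the abstract's optimality claim for $d$ much larger than $k^2$.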

Taxonomy

Topics: Machine Learning and Algorithms · Bayesian Methods and Mixture Models