Differentially-Private Clustering of Easy Instances

Edith Cohen; Haim Kaplan; Yishay Mansour; Uri Stemmer; Eliad Tsfadia

arXiv:2112.14445 · cs.LG · December 30, 2021 · 5 cites

TL;DR

This paper introduces a simple, implementable framework for differentially private clustering that provides utility on easy instances, such as data with well-separated clusters. It improves sample complexity bounds in some cases of Gaussian mixtures and $k$-means, and it is evaluated empirically on synthetic data.

Contribution

It presents a novel framework that applies non-private clustering algorithms to easy instances and combines results privately, enhancing practical utility in differentially private clustering.

Findings

01 Improved sample complexity bounds in some cases of Gaussian mixtures and $k$-means.

02 An empirical evaluation on synthetic data that complements the theoretical analysis.

03 A framework that applies non-private clustering algorithms to easy instances and privately combines the results.

Abstract

Clustering is a fundamental problem in data analysis. In differentially private clustering, the goal is to identify $k$ cluster centers without disclosing information on individual data points. Despite significant research progress, the problem has so far resisted practical solutions. In this work we aim at providing simple implementable differentially private clustering algorithms that provide utility when the data is "easy," e.g., when there exists a significant separation between the clusters. We propose a framework that allows us to apply non-private clustering algorithms to the easy instances and privately combine the results. We are able to get improved sample complexity bounds in some cases of Gaussian mixtures and $k$-means. We complement our theoretical analysis with an empirical evaluation on synthetic data.
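To make the "cluster non-privately, then combine privately" idea concrete, here is a minimal one-dimensional sketch. It is not the paper's algorithm: it assumes a generic sample-and-aggregate scheme with the Laplace mechanism, a publicly known data range `[lo, hi]`, and hypothetical helper names (`lloyd_1d`, `laplace`, `private_centers_1d`). It also ignores floating-point issues that a production-grade private mechanism would need to handle.

```python
import random

def lloyd_1d(points, k, iters=20):
    """Plain (non-private) 1-D k-means via Lloyd's iterations."""
    pts = sorted(points)
    # Initialize centers at evenly spaced quantiles of the sorted points.
    centers = [pts[(2 * j + 1) * len(pts) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for x in pts:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            buckets[nearest].append(x)
        # Keep the old center if a bucket ends up empty.
        centers = [sum(b) / len(b) if b else centers[j]
                   for j, b in enumerate(buckets)]
    return sorted(centers)

def laplace(scale, rng):
    """Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_centers_1d(points, k, epsilon, lo, hi, num_parts, rng):
    """Assumed sketch: non-private clustering per part, noisy averaging across parts."""
    # Partition by position only, so the split does not depend on data values.
    # Assumes len(points) >= num_parts so that no part is empty.
    parts = [points[i::num_parts] for i in range(num_parts)]
    # Each part yields one sorted k-tuple, clipped to the public range [lo, hi].
    tuples = [[min(max(c, lo), hi) for c in lloyd_1d(p, k)] for p in parts]
    # Replacing one input point changes at most one tuple, which moves each of
    # the k position-wise means by at most (hi - lo) / num_parts.
    scale = k * (hi - lo) / (num_parts * epsilon)
    return [sum(t[j] for t in tuples) / num_parts + laplace(scale, rng)
            for j in range(k)]

# Usage on an "easy" instance: two well-separated clusters near 0.2 and 0.8.
rng = random.Random(0)
data = ([rng.gauss(0.2, 0.01) for _ in range(3000)]
        + [rng.gauss(0.8, 0.01) for _ in range(3000)])
rng.shuffle(data)
centers = private_centers_1d(data, k=2, epsilon=1.0, lo=0.0, hi=1.0,
                             num_parts=200, rng=rng)
```

Sorting each tuple is what aligns centers across parts in one dimension. In higher dimensions the tuples must be matched in some other way, and this sketch does not attempt that step. When clusters are well separated, the per-part tuples nearly agree, so the noisy average stays close to the true centers.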

Videos

Differentially-Private Clustering of Easy Instances · SlidesLive

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Mobile Crowdsensing and Crowdsourcing