Differentially Private $k$-Means Clustering
Dong Su, Jianneng Cao, Ninghui Li, Elisa Bertino, Hongxia Jin

TL;DR
This paper introduces a hybrid differentially private $k$-means clustering method that combines non-interactive data synopsis publication with interactive refinement, optimizing privacy budget allocation for improved clustering accuracy.
Contribution
It proposes a novel hybrid approach for differentially private $k$-means clustering that balances non-interactive and interactive methods, with analysis and experiments validating its effectiveness.
Findings
Hybrid approach improves clustering accuracy under privacy constraints.
Allocating the privacy budget optimally between the non-interactive and interactive stages reduces clustering error.
Experimental results confirm theoretical analysis.
Abstract
There are two broad approaches for differentially private data analysis. The interactive approach aims at developing customized differentially private algorithms for various data mining tasks. The non-interactive approach aims at developing differentially private algorithms that can output a synopsis of the input dataset, which can then be used to support various data mining tasks. In this paper we study the tradeoff of interactive vs. non-interactive approaches and propose a hybrid approach that combines interactive and non-interactive, using $k$-means clustering as an example. In the hybrid approach to differentially private $k$-means clustering, one first uses a non-interactive mechanism to publish a synopsis of the input dataset, then applies the standard $k$-means clustering algorithm to learn cluster centroids, and finally uses an interactive approach to further improve these…
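The three stages described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes data scaled to $[-1, 1]^d$, uses a Laplace-noised uniform grid histogram as a stand-in synopsis, and refines with a single noisy Lloyd iteration. The function names, the grid resolution `cells_per_dim`, and the budget split `alpha` are all hypothetical; in particular, the 50/50 split is a placeholder and not the allocation the paper derives.

```python
import numpy as np

def noisy_grid_synopsis(X, eps, cells_per_dim, rng):
    """Non-interactive stage (assumed form): Laplace-noised counts on a uniform grid."""
    d = X.shape[1]
    edges = [np.linspace(-1.0, 1.0, cells_per_dim + 1)] * d
    counts, _ = np.histogramdd(X, bins=edges)
    # Adding or removing one record changes one cell count by 1 (sensitivity 1).
    noisy = counts + rng.laplace(scale=1.0 / eps, size=counts.shape)
    mids = (edges[0][:-1] + edges[0][1:]) / 2.0
    centers = np.stack(np.meshgrid(*([mids] * d), indexing="ij"), axis=-1).reshape(-1, d)
    weights = np.clip(noisy.ravel(), 0.0, None)  # post-processing: drop negative counts
    return centers, weights

def weighted_kmeans(points, weights, k, iters, rng):
    """Standard Lloyd iterations on the synopsis; touches no raw data."""
    p = weights + 1e-12
    p = p / p.sum()
    centroids = points[rng.choice(len(points), size=k, replace=False, p=p)]
    for _ in range(iters):
        dist = ((points[:, None, :] - centroids[None]) ** 2).sum(-1)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            w = weights[labels == j]
            if w.sum() > 0:
                centroids[j] = (points[labels == j] * w[:, None]).sum(0) / w.sum()
    return centroids

def dp_lloyd_step(X, centroids, eps, rng):
    """Interactive stage (assumed form): one Lloyd update with noisy counts and sums."""
    d = X.shape[1]
    dist = ((X[:, None, :] - centroids[None]) ** 2).sum(-1)
    labels = np.argmin(dist, axis=1)
    # One record changes one count by 1 and one sum vector by at most d in L1
    # (coordinates lie in [-1, 1]), so the joint L1 sensitivity is d + 1.
    scale = (d + 1) / eps
    new = centroids.copy()
    for j in range(len(centroids)):
        members = X[labels == j]
        n = len(members) + rng.laplace(scale=scale)
        s = members.sum(axis=0) + rng.laplace(scale=scale, size=d)
        if n >= 1.0:  # keep the old centroid when the noisy count is unusable
            new[j] = np.clip(s / n, -1.0, 1.0)
    return new

def hybrid_dp_kmeans(X, k, eps, alpha=0.5, cells_per_dim=10, seed=0):
    """Split eps between the two stages; sequential composition gives eps in total."""
    rng = np.random.default_rng(seed)
    eps_synopsis, eps_interactive = alpha * eps, (1.0 - alpha) * eps
    centers, weights = noisy_grid_synopsis(X, eps_synopsis, cells_per_dim, rng)
    centroids = weighted_kmeans(centers, weights, k, iters=20, rng=rng)
    return dp_lloyd_step(X, centroids, eps_interactive, rng)
```

Calling `hybrid_dp_kmeans(X, k=2, eps=1.0)` on an array `X` of shape `(n, d)` with entries in $[-1, 1]$ returns a `(k, d)` array of centroids. Only the synopsis and the final refinement step read the raw data, which is why the budget is divided between exactly those two stages.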
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Mobile Crowdsensing and Crowdsourcing
