Privacy-Preserving Optimal Parameter Selection for Collaborative Clustering

Maryam Ghasemian; Erman Ayday

arXiv:2406.05545·cs.LG·June 11, 2024

TL;DR

This paper proposes a privacy-preserving method for selecting optimal parameters in collaborative clustering, balancing data privacy with clustering quality using differential privacy techniques.

Contribution

It introduces a framework that recommends clustering algorithms and parameters while protecting data privacy through differential privacy, specifically using the Randomized Response mechanism.

Findings

01

The privacy parameter $ϵ$ has minimal effect on the server's recommendations.

02

Increasing $ϵ$ raises the risk of membership inference attacks.

03

Clustering quality remains high under differential privacy, as measured by standard clustering metrics.

Abstract

This study investigates the optimal selection of parameters for collaborative clustering while ensuring data privacy. We focus on key clustering algorithms within a collaborative framework, where multiple data owners combine their data. A semi-trusted server assists in recommending the most suitable clustering algorithm and its parameters. Our findings indicate that the privacy parameter ($ϵ$) minimally impacts the server's recommendations, but an increase in $ϵ$ raises the risk of membership inference attacks, where sensitive information might be inferred. To mitigate these risks, we implement differential privacy techniques, particularly the Randomized Response mechanism, to add noise and protect data privacy. Our approach demonstrates that high-quality clustering can be achieved while maintaining data confidentiality, as evidenced by metrics such as the Adjusted Rand…
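To make the role of $ϵ$ concrete, the following is a minimal sketch of textbook binary Randomized Response, not the paper's own implementation. It assumes each record is a single 0/1 attribute and that the true bit is reported with probability $e^{ϵ}/(1+e^{ϵ})$ and flipped otherwise; the function names (`keep_probability`, `randomized_response`, `estimate_mean`) are hypothetical and chosen only for illustration.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit (assumed textbook mechanism)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bits, epsilon, rng=None):
    """Perturb each 0/1 value independently: keep it with probability p, flip otherwise."""
    rng = rng or random.Random()
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_mean(noisy_bits, epsilon):
    """Debias the observed fraction of 1s; requires epsilon > 0 so that 2p - 1 != 0."""
    p = keep_probability(epsilon)
    observed = sum(noisy_bits) / len(noisy_bits)
    # E[observed] = mu * (2p - 1) + (1 - p), solved here for mu.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    true_bits = [1] * 300 + [0] * 700  # hypothetical data: 30% ones
    for eps in (0.5, 1.0, 4.0):
        noisy = randomized_response(true_bits, eps, rng)
        print(eps, round(keep_probability(eps), 3), round(estimate_mean(noisy, eps), 3))
```

In this sketch a small $ϵ$ pushes the keep probability toward 0.5, so each reported bit reveals little about the true value, while a large $ϵ$ pushes it toward 1 and leaves the data almost unperturbed. That trade-off is in line with the abstract's observation that a larger $ϵ$ raises the risk of membership inference.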

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Customer churn and segmentation

MethodsFocus