Privacy-Preserving Optimal Parameter Selection for Collaborative Clustering
Maryam Ghasemian, Erman Ayday

TL;DR
This paper proposes a privacy-preserving method for selecting optimal parameters in collaborative clustering, balancing data privacy with clustering quality using differential privacy techniques.
Contribution
It introduces a framework that recommends clustering algorithms and parameters while protecting data privacy through differential privacy, specifically using the Randomized Response mechanism.
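To make the Randomized Response mechanism concrete, here is a minimal sketch of its standard binary form. It is an illustration only, not the authors' implementation: the function names (`truth_probability`, `randomized_response`, `estimate_mean`) and the choice of binary inputs are assumptions introduced for this example, and the summary does not say how the paper encodes its data.

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit in binary randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bits, epsilon, rng=None):
    """Keep each bit with probability e^eps / (1 + e^eps), otherwise flip it."""
    rng = rng or random.Random()
    p = truth_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_mean(reports, epsilon):
    """Debias the observed fraction of 1s to estimate the true fraction.

    Requires epsilon > 0; at epsilon = 0 the reports carry no signal.
    """
    p = truth_probability(epsilon)
    observed = sum(reports) / len(reports)
    return (observed + p - 1.0) / (2.0 * p - 1.0)
```

A smaller `epsilon` flips more bits and so hides individual values better, while the debiasing step still recovers aggregate statistics when enough reports are available.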
Findings
Privacy parameter $\epsilon$ has minimal effect on recommendations.
Increasing $\epsilon$ raises risk of membership inference attacks.
Clustering quality remains high under differential privacy, according to the reported evaluation metrics.
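One way to see why a larger privacy parameter increases inference risk is to compute how confident an observer can become about a single true bit after seeing its noisy report. The sketch below assumes standard binary randomized response and a simple Bayesian observer. It is not the membership inference attack evaluated in the paper, and `posterior_after_report` is a hypothetical helper name.

```python
import math


def posterior_after_report(epsilon: float, prior: float = 0.5) -> float:
    """Posterior probability that the true bit is 1, given a reported 1.

    Assumes binary randomized response, which reports the true bit with
    probability p = e^eps / (1 + e^eps), and an observer with the given prior.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return prior * p / (prior * p + (1.0 - prior) * (1.0 - p))


# The posterior grows with epsilon: less noise means easier inference.
for eps in (0.1, 1.0, 3.0):
    print(f"epsilon={eps}: posterior={posterior_after_report(eps):.3f}")
```

With a uniform prior the posterior equals the truth-telling probability itself, so it moves from near 0.5 (no information) toward 1 (near certainty) as the privacy parameter grows.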
Abstract
This study investigates the optimal selection of parameters for collaborative clustering while ensuring data privacy. We focus on key clustering algorithms within a collaborative framework, where multiple data owners combine their data. A semi-trusted server assists in recommending the most suitable clustering algorithm and its parameters. Our findings indicate that the privacy parameter ($\epsilon$) minimally impacts the server's recommendations, but an increase in $\epsilon$ raises the risk of membership inference attacks, where sensitive information might be inferred. To mitigate these risks, we implement differential privacy techniques, particularly the Randomized Response mechanism, to add noise and protect data privacy. Our approach demonstrates that high-quality clustering can be achieved while maintaining data confidentiality, as evidenced by metrics such as the Adjusted Rand…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Customer churn and segmentation
