Differentially Private Federated Clustering with Random Rebalancing
Xiyuan Yang, Shengyuan Hu, Soyeon Kim, Tian Li

TL;DR
This paper introduces RR-Cluster, a simple rebalancing technique for federated clustering that reduces privacy noise and improves utility while maintaining privacy guarantees, validated through theoretical analysis and experiments.
Contribution
The paper proposes RR-Cluster, a lightweight rebalancing method that enhances privacy-utility tradeoffs in federated clustering by controlling cluster assignment sizes.
Findings
RR-Cluster reduces privacy noise variance.
It improves clustering utility under differential privacy.
It is shown to be effective on both synthetic and real datasets.
Abstract
Federated clustering aims to group similar clients into clusters and produce one model for each cluster. Such a personalization approach typically improves model performance compared with training a single model to serve all clients, but can be more vulnerable to privacy leakage. Directly applying client-level differentially private (DP) mechanisms to federated clustering could degrade utility significantly. We identify that such deficiencies are mainly due to the difficulty of averaging out privacy noise within each cluster (following standard privacy mechanisms), as the number of clients assigned to the same cluster is uncontrolled. To this end, we propose a simple and effective technique, named RR-Cluster, that can be viewed as a lightweight add-on to many federated clustering algorithms. RR-Cluster achieves reduced privacy noise via randomly rebalancing cluster assignments,…
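The link between cluster size and averaged privacy noise can be illustrated with a small sketch. It is not the paper's algorithm: the rebalancing rule below (move randomly chosen clients out of the largest cluster until every cluster reaches a minimum size) is a hypothetical stand-in, and the names `rebalance`, `averaged_noise_std`, `min_size`, `sigma`, and `clip_norm` are assumptions introduced for illustration. The noise formula assumes a standard Gaussian mechanism in which noise of standard deviation `sigma * clip_norm` is added to each cluster's summed update before dividing by the cluster size.

```python
import random
from collections import Counter, defaultdict


def averaged_noise_std(sigma, clip_norm, cluster_size):
    """Per-coordinate noise std after averaging a noised sum over a cluster.

    Assumes Gaussian noise with std sigma * clip_norm is added to the sum
    of clipped updates, which is then divided by cluster_size.
    """
    return sigma * clip_norm / cluster_size


def rebalance(assignments, num_clusters, min_size, rng):
    """Hypothetical rebalancing rule (not taken from the paper).

    Moves randomly chosen clients out of the currently largest cluster
    until every cluster holds at least min_size clients.
    """
    if len(assignments) < num_clusters * min_size:
        raise ValueError("not enough clients to give every cluster min_size")

    members = defaultdict(list)
    for client, cluster in assignments.items():
        members[cluster].append(client)

    balanced = dict(assignments)
    for k in range(num_clusters):
        while len(members[k]) < min_size:
            # The largest cluster always has more than min_size clients here,
            # so donating one client never pushes it below the threshold.
            donor = max(range(num_clusters), key=lambda j: len(members[j]))
            moved = members[donor].pop(rng.randrange(len(members[donor])))
            members[k].append(moved)
            balanced[moved] = k
    return balanced


# Toy example: 20 clients with a highly unbalanced initial assignment.
initial = {c: 0 for c in range(15)}
initial.update({c: 1 for c in range(15, 19)})
initial[19] = 2

balanced = rebalance(initial, num_clusters=3, min_size=5, rng=random.Random(0))
sizes_before = Counter(initial.values())
sizes_after = Counter(balanced.values())

for k in range(3):
    before = averaged_noise_std(1.0, 1.0, sizes_before[k])
    after = averaged_noise_std(1.0, 1.0, sizes_after[k])
    print(f"cluster {k}: size {sizes_before[k]} -> {sizes_after[k]}, "
          f"noise std {before:.3f} -> {after:.3f}")
```

In this toy setup the smallest cluster grows from 1 client to 5, so its averaged noise standard deviation shrinks by a factor of 5, while the largest cluster gives up some clients and sees slightly more noise. Clients moved away from their preferred cluster may be served by a worse-matching model, which is the bias side of the trade-off the reviewers mention.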
Peer Reviews
Decision: Submitted to ICLR 2026
1. The proposed work addresses improving the performance of existing clustered FL algorithms when they are enhanced with DP guarantees, which is an important problem.
2. An extensive set of experimental results is reported (however, they need to be improved; see below).
While I fully understand the point of the proposed idea, I strongly feel that the current experimental results do not evaluate it properly enough to validate the claims in the paper. I list the weaknesses here, followed by my detailed questions in the next section for clarification.
1. The privacy setting considers a trusted server, which may not always be available in FL settings.
2. The experimental results are reported in an optimistic way that does not fully evalua
- The random rebalancing approach can be integrated with various clustered FL algorithms.
- The derivations seem correct and consistent with standard DP theory.
- The method consistently outperforms baselines across datasets and privacy budgets.
- The method assumes that rebalancing doesn't significantly hurt utility, but under concept shift (adversarial or not), incorrect assignments could accumulate bias.
- Experiments focus on classification tasks with synthetic and benchmark datasets (despite the abstract's claim of using real-world datasets).
- Only average results are reported.
1. The paper is clearly written and easy to follow.
2. The proposed RR-Cluster method is promising, as increasing the number of clients contributing to small clusters effectively mitigates the noise intensity introduced by differential privacy.
3. The authors conduct both theoretical (privacy and convergence) and empirical evaluations of RR-Cluster. The derived theoretical bounds also capture the bias–variance trade-off introduced by the proposed mechanism.
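The noise-mitigation effect of adding clients to small clusters can be made concrete with a short worked calculation. This is a sketch under the assumption of a standard Gaussian mechanism; the symbols $C$ (clipping norm), $\sigma$ (noise multiplier), $S_k$ (clients in cluster $k$), and $n_k = |S_k|$ are introduced here for illustration and are not taken from the paper.

```latex
% Assumed mechanism: clip each client update to L2 norm C, add Gaussian
% noise to the per-cluster sum, then average over the n_k clients.
\tilde{\Delta}_k
  = \frac{1}{n_k}\Big(\sum_{i \in S_k} \mathrm{clip}(\Delta_i, C) + z\Big),
  \qquad z \sim \mathcal{N}\big(0,\ \sigma^2 C^2 I\big)

% The noise component of the averaged update is therefore
\frac{z}{n_k} \sim \mathcal{N}\Big(0,\ \frac{\sigma^2 C^2}{n_k^2}\, I\Big)

% Worked example: growing a cluster from n_k = 2 to n_k = 10 clients
\text{std: } \frac{\sigma C}{2} \;\to\; \frac{\sigma C}{10}
  \quad (5\times \text{ smaller}),
\qquad
\text{variance: } \frac{\sigma^2 C^2}{4} \;\to\; \frac{\sigma^2 C^2}{100}
  \quad (25\times \text{ smaller})
```

Under this assumption the per-coordinate noise variance falls with the square of the cluster size, which is why very small clusters dominate the utility loss.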
1. The paper lacks a detailed description of the defense and attack models, which is essential for readers to fully understand the assumptions and setup of the considered DP-FL system.
2. The proposed RR-Cluster method appears to rely on the assumption that the server is fully honest. However, its effectiveness may significantly degrade, or even vanish, under an honest-but-curious server model, limiting its practical applicability.
3. The final convergence bound presented in Corollary 1 seems ov
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Privacy, Security, and Data Protection · Cryptography and Data Security
