Federated Geometric Monte Carlo Clustering to Counter Non-IID Datasets
Federico Lucchetti, Jérémie Decouchant, Maria Fernandes, Lydia Y. Chen, Marcus Völp

TL;DR
FedGMCC is a federated clustering framework that uses a Monte Carlo procedure to handle datasets that are non-IID in both features and labels, improving convergence rate (+63% over FedAvg and FedProx) and accuracy (+4% on genomic datasets).
Contribution
The paper presents FedGMCC, a new clustering-based federated learning approach that accounts for both non-IID features and labels, enhancing model aggregation and performance.
Findings
Outperforms FedAvg and FedProx in convergence rates (+63%)
Achieves +4% accuracy improvement on genomic datasets
Remains effective when both features and labels are strongly non-IID across clients
Abstract
Federated learning allows clients to collaboratively train models on datasets that are acquired in different locations and that cannot be exchanged because of their size or regulations. Such collected data is increasingly non-independent and non-identically distributed (non-IID), negatively affecting training accuracy. Previous works tried to mitigate the effects of non-IID datasets on training accuracy, focusing mainly on non-IID labels; however, practical datasets often also contain non-IID features. To address both non-IID labels and features, we propose FedGMCC, a novel framework where a central server aggregates client models that it can cluster together. FedGMCC clustering relies on a Monte Carlo procedure that samples the output space of client models, infers their position in the weight space on a loss manifold, and computes their geometric connection via an affine curve…
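The general idea of server-side clustering before aggregation can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract is truncated, and the loss-manifold positioning and affine-curve connection are not reproduced here. As a simplified stand-in, the sketch assumes each client model is a linear softmax classifier, compares models by their outputs on randomly sampled probe inputs, groups them with a distance threshold, and averages weights within each group. All function names, the probe distribution, and the threshold are hypothetical.

```python
import numpy as np


def sample_outputs(weights, probes):
    """Softmax outputs of a linear model (assumed stand-in for a client model)."""
    logits = probes @ weights
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def output_distance_matrix(client_weights, probes):
    """Monte Carlo estimate of pairwise output disagreement between clients."""
    outputs = [sample_outputs(w, probes) for w in client_weights]
    n = len(outputs)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = np.mean(np.abs(outputs[i] - outputs[j]))
    return dist


def cluster_clients(dist, threshold):
    """Group clients linked by a chain of distances below the threshold.

    This is plain single-linkage grouping, used here in place of the
    paper's geometric connection criterion (an assumption of this sketch).
    """
    n = len(dist)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist[i, j] < threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def aggregate_per_cluster(client_weights, labels):
    """Average client weights within each cluster (unweighted mean)."""
    clusters = {}
    for w, label in zip(client_weights, labels):
        clusters.setdefault(label, []).append(w)
    return {label: np.mean(ws, axis=0) for label, ws in clusters.items()}
```

In this sketch the server would draw `probes` once per round, call `output_distance_matrix`, then `cluster_clients`, and finally keep one aggregated model per cluster instead of a single global average as in FedAvg.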
Taxonomy
Topics: AI in cancer detection · Face recognition and analysis
