Cohort Squeeze: Beyond a Single Communication Round per Cohort in Cross-Device Federated Learning
Kai Yi, Timur Kharisov, Igor Sokolov, Peter Richtárik

TL;DR
This paper proposes a federated learning approach, built on a new stochastic proximal point method, that runs multiple communication rounds per cohort and reduces total communication cost in cross-device FL by up to 74%.
Contribution
It introduces SPPM-AS, a new method enabling multiple communication rounds per cohort, challenging the single-round paradigm in federated learning.
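The idea of spending several communication rounds on one cohort can be illustrated with a toy proximal-point loop. This is a minimal sketch under stated assumptions, not the authors' SPPM-AS implementation: each client loss is taken to be the scalar quadratic 0.5 * (y - b_i)^2, cohorts are sampled uniformly, and the proximal subproblem is solved approximately by plain gradient descent. The names `cohort_prox_step` and `prox_point_sketch` and all parameter values are hypothetical.

```python
import random


def cohort_prox_step(x, cohort_targets, gamma=1.0, comm_rounds=10, lr=0.4):
    """Approximately solve min_y f_S(y) + (1 / (2 * gamma)) * (y - x)^2.

    f_S is the average of the assumed client losses 0.5 * (y - b_i)^2.
    Each loop iteration stands for one communication round with the
    same cohort: clients report gradients at y, the server averages them.
    """
    y = x
    for _ in range(comm_rounds):
        client_grads = [y - b for b in cohort_targets]
        grad = sum(client_grads) / len(client_grads) + (y - x) / gamma
        y -= lr * grad
    return y


def prox_point_sketch(targets, cohort_size=3, outer_steps=100, seed=0, **prox_kwargs):
    """Outer loop: sample a cohort, then run several rounds with it."""
    rng = random.Random(seed)
    x = 0.0
    for _ in range(outer_steps):
        cohort = rng.sample(range(len(targets)), cohort_size)
        x = cohort_prox_step(x, [targets[i] for i in cohort], **prox_kwargs)
    return x


client_targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_multi = prox_point_sketch(client_targets, gamma=1.0, comm_rounds=10)
```

In this toy setting the exact proximal point is (gamma * m + x) / (gamma + 1), where m is the mean of the cohort's targets, so the inner loop can be checked against a closed form. Setting `comm_rounds=1` gives a single-round variant for comparison.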
Findings
Achieves up to 74% reduction in communication costs.
Supports various client sampling procedures for improved efficiency.
Outperforms traditional single-round methods in cross-device FL.
Abstract
Virtually all federated learning (FL) methods, including FedAvg, operate in the following manner: i) an orchestrating server sends the current model parameters to a cohort of clients selected via a certain rule, ii) these clients then independently perform a local training procedure (e.g., via SGD or Adam) using their own training data, and iii) the resulting models are shipped to the server for aggregation. This process is repeated until a model of suitable quality is found. A notable feature of these methods is that each cohort is involved in only a single communication round with the server. In this work we challenge this algorithmic design primitive and investigate whether it is possible to "squeeze more juice" out of each cohort than what is possible in a single communication round. Surprisingly, we find that this is indeed the case, and our approach leads to up to 74% reduction in…
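The three-step loop described in the abstract (broadcast, local training, aggregation) can be sketched as follows. This is a simplified illustration, not FedAvg as used in the paper's experiments: each client is assumed to hold the scalar loss 0.5 * (x - b_i)^2, cohorts are sampled uniformly at random, and aggregation is an unweighted average. The function names and hyperparameters are hypothetical.

```python
import random


def local_training(x, target, lr=0.1, steps=5):
    """Step ii: a client runs a few gradient steps on its own loss.

    The loss 0.5 * (x - target)^2 is an assumption made for this sketch.
    """
    for _ in range(steps):
        x -= lr * (x - target)
    return x


def single_round_fl(targets, cohort_size=3, rounds=200, seed=0):
    """Each sampled cohort takes part in exactly one communication round."""
    rng = random.Random(seed)
    x = 0.0  # current server model
    for _ in range(rounds):
        # Step i: the server picks a cohort and sends it the current model.
        cohort = rng.sample(range(len(targets)), cohort_size)
        # Step ii: every client in the cohort trains locally.
        local_models = [local_training(x, targets[i]) for i in cohort]
        # Step iii: the server averages the returned models.
        x = sum(local_models) / len(local_models)
    return x


client_targets = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x_single = single_round_fl(client_targets)
```

Note that a fresh cohort is drawn in every iteration of the outer loop, which is the single-round-per-cohort design the paper questions.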
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Age of Information Optimization
Methods: Stochastic Gradient Descent
