Cohort Squeeze: Beyond a Single Communication Round per Cohort in Cross-Device Federated Learning

Kai Yi; Timur Kharisov; Igor Sokolov; Peter Richtárik

arXiv:2406.01115·cs.LG·June 4, 2024

Open Access

TL;DR

This paper proposes a federated learning approach in which each cohort takes part in multiple communication rounds instead of one. Built on a new stochastic proximal point method, it reduces total communication cost in cross-device FL by up to 74%.

Contribution

The paper introduces SPPM-AS, a stochastic proximal point method that allows multiple communication rounds per cohort, challenging the single-round-per-cohort design common in federated learning.
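To make the idea of several communication rounds per cohort concrete, here is a minimal sketch, not the authors' algorithm as specified in the paper. It assumes a proximal-point outer loop in which each sampled cohort approximately solves min_w mean_i f_i(w) + ||w - w_anchor||^2 / (2*gamma), using `k_rounds` server-client rounds of plain gradient descent. The least-squares client losses, uniform cohort sampling, the inner solver, and every name and default (`cohort_prox_step`, `sppm_sketch`, `make_clients`, `gamma`, `k_rounds`, `lr`) are illustrative assumptions.

```python
import numpy as np

def make_clients(n_clients=10, n_samples=20, dim=3, seed=1):
    """Hypothetical synthetic data: every client has a noise-free linear task."""
    rng = np.random.default_rng(seed)
    w_true = np.arange(1.0, dim + 1.0)
    clients = []
    for _ in range(n_clients):
        X = rng.normal(size=(n_samples, dim))
        clients.append((X, X @ w_true))
    return clients, w_true

def client_grad(w, X, y):
    """Gradient of the client's least-squares loss (assumed loss)."""
    return X.T @ (X @ w - y) / len(y)

def cohort_prox_step(w_anchor, cohort_data, gamma=1.0, k_rounds=10, lr=0.1):
    """Approximate proximal step computed with the SAME cohort over k_rounds rounds."""
    w = w_anchor.copy()
    for _ in range(k_rounds):
        # One communication round: clients send gradients, server averages them.
        g = np.mean([client_grad(w, X, y) for X, y in cohort_data], axis=0)
        # Add the gradient of the proximal term ||w - w_anchor||^2 / (2*gamma).
        g = g + (w - w_anchor) / gamma
        w = w - lr * g
    return w

def sppm_sketch(clients, dim, n_cohorts=100, cohort_size=3, seed=0, **prox_kwargs):
    """Outer loop: sample a cohort, take one approximate proximal step, repeat."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    for _ in range(n_cohorts):
        idx = rng.choice(len(clients), size=cohort_size, replace=False)
        w = cohort_prox_step(w, [clients[i] for i in idx], **prox_kwargs)
    return w

clients, w_true = make_clients()
w_hat = sppm_sketch(clients, dim=3)
print(np.linalg.norm(w_hat - w_true))  # distance to the shared minimizer
```

In this sketch the total number of communication rounds is `n_cohorts * k_rounds`, so the trade-off of interest is between contacting fewer cohorts and spending more rounds on each one.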

Findings

01. Achieves up to a 74% reduction in communication costs.

02. Supports various client sampling procedures for improved efficiency.

03. Outperforms traditional single-round methods in cross-device FL.

Abstract

Virtually all federated learning (FL) methods, including FedAvg, operate in the following manner: i) an orchestrating server sends the current model parameters to a cohort of clients selected via a certain rule, ii) these clients then independently perform a local training procedure (e.g., via SGD or Adam) using their own training data, and iii) the resulting models are shipped to the server for aggregation. This process is repeated until a model of suitable quality is found. A notable feature of these methods is that each cohort is involved in only a single communication round with the server. In this work we challenge this algorithmic design primitive and investigate whether it is possible to "squeeze more juice" out of each cohort than what is possible in a single communication round. Surprisingly, we find that this is indeed the case, and our approach leads to up to 74% reduction in…
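The three-step loop described in the abstract (broadcast, local training, aggregation) can be sketched as a FedAvg-style baseline in which each cohort is contacted for exactly one round. This is a minimal illustration under stated assumptions: the least-squares client loss, uniform cohort sampling, plain averaging, and the names `local_sgd`, `fedavg`, and `make_clients` with their defaults are all hypothetical and not taken from the paper.

```python
import numpy as np

def make_clients(n_clients=10, n_samples=20, dim=3, seed=1):
    """Hypothetical synthetic data: every client has a noise-free linear task."""
    rng = np.random.default_rng(seed)
    w_true = np.arange(1.0, dim + 1.0)
    clients = []
    for _ in range(n_clients):
        X = rng.normal(size=(n_samples, dim))
        clients.append((X, X @ w_true))
    return clients, w_true

def local_sgd(w, X, y, lr=0.1, steps=5):
    """Step (ii): a few full-batch gradient steps on the client's own data."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg(clients, dim, rounds=200, cohort_size=3, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    for _ in range(rounds):
        # Step (i): the server selects a cohort and sends it the current model.
        cohort = rng.choice(len(clients), size=cohort_size, replace=False)
        # Step (ii): each selected client trains independently.
        local_models = [local_sgd(w, *clients[i]) for i in cohort]
        # Step (iii): the server averages the returned models.
        # A fresh cohort is drawn next round, so each cohort is used once.
        w = np.mean(local_models, axis=0)
    return w

clients, w_true = make_clients()
w_hat = fedavg(clients, dim=3)
print(np.linalg.norm(w_hat - w_true))  # distance to the shared minimizer
```

Here one pass of the outer loop equals one communication round, which is the single-round-per-cohort pattern the paper questions.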

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Age of Information Optimization

Methods: Stochastic Gradient Descent