Throughput-Optimal Topology Design for Cross-Silo Federated Learning
Othmane Marfoq, Chuan Xu, Giovanni Neglia, Richard Vidal

TL;DR
This paper introduces a topology design approach for cross-silo federated learning that maximizes communication throughput, accelerating training relative to the traditional client-server architecture.
Contribution
It formulates the topology design problem using max-plus linear systems and proposes algorithms that maximize throughput with provable guarantees.
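For intuition on the max-plus view: in a strongly connected max-plus linear system, the asymptotic time per round (the cycle time) equals the maximum cycle mean of the delay graph, and the throughput is its reciprocal. The sketch below computes this with Karp's algorithm. It is an illustration only, not the paper's topology design algorithms; the function names and the edge delays in the usage note are hypothetical.

```python
from math import inf


def max_cycle_mean(n, edges):
    """Karp's algorithm for the maximum cycle mean of a digraph.

    n     -- number of nodes, labelled 0..n-1 (assumed strongly connected)
    edges -- list of (src, dst, delay) tuples; delays are illustrative
    """
    # D[k][v] = largest total delay of a walk with exactly k edges from node 0 to v
    D = [[-inf] * n for _ in range(n + 1)]
    D[0][0] = 0.0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if D[k - 1][u] > -inf:
                D[k][v] = max(D[k][v], D[k - 1][u] + w)

    best = -inf
    for v in range(n):
        if D[n][v] == -inf:
            continue  # no n-edge walk ends at v
        # Karp's formula: minimise over the walk lengths k that reach v
        candidate = min(
            (D[n][v] - D[k][v]) / (n - k)
            for k in range(n)
            if D[k][v] > -inf
        )
        best = max(best, candidate)
    return best


def throughput(n, edges):
    """Rounds per time unit = 1 / cycle time (assumes at least one cycle)."""
    return 1.0 / max_cycle_mean(n, edges)
```

As a usage example, a directed ring of three silos with hypothetical delays 1, 2, and 3 time units, `[(0, 1, 1.0), (1, 2, 2.0), (2, 0, 3.0)]`, has a single cycle of mean 2, so `throughput(3, ...)` is 0.5 rounds per time unit. Adding a self-loop `(0, 0, 2.5)` to model a slow local computation raises the cycle time to 2.5.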
Findings
The proposed algorithms speed up training by up to 9x over the master-slave architecture.
Speedups remain significant with slower access links.
Practical algorithms outperform state-of-the-art methods.
Abstract
Federated learning usually employs a client-server architecture where an orchestrator iteratively aggregates model updates from remote clients and pushes a refined model back to them. This approach may be inefficient in cross-silo settings, as close-by data silos with high-speed access links may exchange information faster than with the orchestrator, and the orchestrator may become a communication bottleneck. In this paper we define the problem of topology design for cross-silo federated learning using the theory of max-plus linear systems to compute the system throughput, that is, the number of communication rounds per time unit. We also propose practical algorithms that, given measurable network characteristics, find a topology with the largest throughput or with provable throughput guarantees. In realistic Internet networks with 10 Gbps access links for silos, our algorithms speed…
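The orchestrator bottleneck mentioned in the abstract can be illustrated with back-of-the-envelope arithmetic. The sketch below assumes a deliberately simplified model in which an access link's capacity is shared equally among concurrent transfers, and it ignores latency, computation time, and core-network congestion. Only the 10 Gbps access-link figure comes from the abstract; the model size, the silo count, and the function name are hypothetical.

```python
def transfer_time_s(model_bits, link_bps, concurrent_transfers):
    """Time to finish all transfers that share one access link equally.

    Simplifying assumption (not taken from the paper): the link is the only
    bottleneck, and every transfer gets link_bps / concurrent_transfers.
    """
    return model_bits * concurrent_transfers / link_bps


MODEL_BITS = 8e8   # hypothetical 100 MB model
LINK_BPS = 10e9    # 10 Gbps access link, as in the abstract
N_SILOS = 50       # hypothetical number of silos

# Client-server: the orchestrator's link carries one model per silo.
star_time = transfer_time_s(MODEL_BITS, LINK_BPS, N_SILOS)

# Ring-like overlay: each silo sends its model to a single neighbour.
ring_time = transfer_time_s(MODEL_BITS, LINK_BPS, 1)
```

Under these assumptions the orchestrator needs about 4 s to receive all models, while each hop in the ring-like overlay takes about 0.08 s, which is why the choice of topology can dominate the time per communication round.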
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Ferroelectric and Negative Capacitance Devices
