Papaya: Practical, Private, and Scalable Federated Learning

Dzmitry Huba; John Nguyen; Kshitiz Malik; Ruiyu Zhu; Mike Rabbat; Ashkan Yousefpour; Carole-Jean Wu; Hongyuan Zhan; Pavel Ustinov; Harish Srinivas; Kaikai Wang; Anthony Shoumikhin; Jesik Min; Mani Malek

arXiv:2111.04877 · cs.LG · April 27, 2022 · 29 cites

TL;DR

This paper presents Papaya, a scalable and private asynchronous federated learning system that outperforms traditional synchronous methods in speed and communication efficiency on large-scale device networks.

Contribution

The paper introduces a practical asynchronous FL system design, addressing scalability, variability, and straggler issues, with empirical evidence of significant speed and efficiency improvements.

Findings

01 Asynchronous FL converges faster than synchronous FL on large-scale device networks.

02 In high-concurrency settings, asynchronous FL trains 5x faster than synchronous FL.

03 Asynchronous FL has nearly 8x less communication overhead than synchronous FL.

Abstract

Cross-device Federated Learning (FL) is a distributed learning paradigm with several challenges that differentiate it from traditional distributed learning; the primary ones are variability in the system characteristics of each device and millions of clients coordinating with a central server. Most FL systems described in the literature are synchronous: they perform a synchronized aggregation of model updates from individual clients. Scaling synchronous FL is challenging since increasing the number of clients training in parallel leads to diminishing returns in training speed, analogous to large-batch training. Moreover, stragglers hinder synchronous FL training. In this work, we outline a production asynchronous FL system design. Our work tackles the aforementioned issues, sketches some of the system design challenges and their solutions, and touches upon principles that emerged…
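The straggler effect described in the abstract can be illustrated with a toy wall-clock model. The sketch below is not Papaya's implementation: the function names, the fixed per-client durations, and the round-robin client selection are all assumptions made for illustration. It counts only how long the server waits to receive a given number of client updates, and it ignores staleness, aggregation rules, and model convergence.

```python
import heapq

def sync_time_for_updates(durations, concurrency, target_updates):
    """Toy synchronous FL: each round starts `concurrency` clients and
    waits for the slowest one before aggregating (hypothetical model)."""
    elapsed, done, start = 0.0, 0, 0
    while done < target_updates:
        cohort = [durations[(start + k) % len(durations)] for k in range(concurrency)]
        start += concurrency
        elapsed += max(cohort)  # the round ends when the straggler finishes
        done += concurrency
    return elapsed

def async_time_for_updates(durations, concurrency, target_updates):
    """Toy asynchronous FL: keep `concurrency` clients busy; whenever one
    finishes, the server takes its update and a new client starts."""
    running = []  # min-heap of client finish times
    next_client = 0
    for _ in range(concurrency):
        heapq.heappush(running, durations[next_client % len(durations)])
        next_client += 1
    done, now = 0, 0.0
    while done < target_updates:
        now = heapq.heappop(running)  # earliest client to finish
        done += 1
        # Replace the finished client immediately; nobody waits for stragglers.
        heapq.heappush(running, now + durations[next_client % len(durations)])
        next_client += 1
    return now

# Assumed workload: three fast clients and one straggler, repeated round-robin.
client_durations = [1.0, 1.0, 1.0, 10.0]
sync_elapsed = sync_time_for_updates(client_durations, concurrency=4, target_updates=8)
async_elapsed = async_time_for_updates(client_durations, concurrency=4, target_updates=8)
print(sync_elapsed)   # 20.0
print(async_elapsed)  # 3.0
```

In this toy setting every synchronous round is gated by the 10-second straggler, while the asynchronous server keeps receiving updates from the fast clients. The gap here is an artifact of the assumed durations and says nothing about the speedups reported in the paper.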

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Stochastic Gradient Optimization Techniques