Personalized Federated Learning with Communication Compression
El Houcine Bergou, Konstantin Burlachenko, Aritra Dutta, Peter Richtárik

TL;DR
This paper introduces a communication-efficient personalized federated learning algorithm that combines bidirectional compression with a probabilistic communication protocol, maintaining convergence rates while reducing communication overhead.
Contribution
It extends the L2GD algorithm with bidirectional compression and probabilistic communication, improving communication efficiency in personalized federated learning.
Findings
Maintains a convergence rate similar to that of vanilla SGD without compression
Reduces communication bottleneck through bidirectional compression
Effective on both convex and non-convex problems
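The two ingredients named above, bidirectional compression and a probabilistic communication protocol, can be illustrated with a minimal sketch. This is not the authors' implementation. The random-k sparsifier is just one common unbiased compressor, chosen here for illustration. The step scalings `alpha / (n * (1 - p))` and `alpha * lam / (n * p)` are assumed from the usual description of uncompressed L2GD and are not confirmed by this summary. All names (`rand_k`, `compressed_l2gd_step`, `grad_fn`, `p`, `lam`, `alpha`, `k`) are hypothetical.

```python
import random


def rand_k(vec, k, rng):
    """Unbiased random-k sparsifier: keep k random coordinates, rescale by d/k."""
    d = len(vec)
    keep = set(rng.sample(range(d), k))
    scale = d / k
    return [v * scale if i in keep else 0.0 for i, v in enumerate(vec)]


def mean(vectors):
    """Coordinate-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def compressed_l2gd_step(models, grad_fn, p, lam, alpha, k, rng):
    """One iteration of a hypothetical compressed L2GD-style update.

    models  : list of per-device parameter vectors (personalized models)
    grad_fn : grad_fn(i, x) returns the local gradient of device i at x
    p       : probability of a communication round
    lam     : personalization penalty (assumed role: pulls models to the average)
    Returns (new_models, communicated).
    """
    n = len(models)
    if rng.random() < p:
        # Communication round: devices compress their models (uplink),
        # the server averages and compresses the average (downlink).
        uplink = [rand_k(x, k, rng) for x in models]
        downlink = rand_k(mean(uplink), k, rng)
        w = alpha * lam / (n * p)  # assumed aggregation weight
        new_models = [
            [(1 - w) * xi + w * ai for xi, ai in zip(x, downlink)]
            for x in models
        ]
        return new_models, True
    # Local round: every device takes a gradient step, nothing is sent.
    s = alpha / (n * (1 - p))  # assumed local step scaling
    new_models = [
        [xi - s * gi for xi, gi in zip(x, grad_fn(i, x))]
        for i, x in enumerate(models)
    ]
    return new_models, False
```

In this sketch a smaller `p` means fewer communication rounds, and a smaller `k` means fewer coordinates sent per round in each direction; setting `k` equal to the dimension turns compression off.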
Abstract
In contrast to training traditional machine learning (ML) models in data centers, federated learning (FL) trains ML models over local datasets contained on resource-constrained heterogeneous edge devices. Existing FL algorithms aim to learn a single global model for all participating devices, which may not be helpful to all devices participating in the training due to the heterogeneity of the data across the devices. Recently, Hanzely and Richtárik (2020) proposed a new formulation for training personalized FL models aimed at balancing the trade-off between the traditional global model and the local models that could be trained by individual devices using their private data only. They derived a new algorithm, called Loopless Gradient Descent (L2GD), to solve it and showed that this algorithm leads to improved communication complexity guarantees in regimes when more personalization…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Age of Information Optimization
Methods: Stochastic Gradient Descent
