Personalized Federated Learning with Communication Compression

El Houcine Bergou; Konstantin Burlachenko; Aritra Dutta; Peter Richtárik

arXiv:2209.05148 · cs.LG · September 13, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a communication-efficient personalized federated learning algorithm that combines bidirectional compression with a probabilistic communication protocol, maintaining convergence rates while reducing communication overhead.

Contribution

It extends the L2GD algorithm with bidirectional compression and probabilistic communication, improving communication efficiency in personalized federated learning.
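The idea can be illustrated with a minimal sketch. It is not the authors' implementation. Several things in it are assumptions made for illustration: the toy quadratic losses f_i(x_i) = 0.5 * ||x_i - b_i||^2, the random-k sparsifier used as the compressor in both directions, the function names `rand_k` and `compressed_l2gd`, and all default parameter values. For simplicity the sketch communicates on every aggregation draw, which is cruder than the protocol analyzed in the paper.

```python
import numpy as np

def rand_k(v, k, rng):
    """Unbiased random-k sparsifier (an assumed compressor choice):
    keep k random coordinates and rescale them by d/k."""
    d = v.size
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = v[idx] * (d / k)
    return out

def compressed_l2gd(targets, lam=1.0, p=0.2, alpha=0.1, k=2, steps=2000, seed=0):
    """L2GD-style loop with compressed uplink and downlink (illustrative only).

    targets[i] is b_i in the toy loss f_i(x_i) = 0.5 * ||x_i - b_i||^2.
    lam is the personalization penalty, p the communication probability.
    """
    rng = np.random.default_rng(seed)
    n, d = targets.shape
    x = np.zeros((n, d))          # one local model per device
    rounds = 0                    # number of communication rounds used
    for _ in range(steps):
        if rng.random() < p:
            # Uplink: every device sends a compressed copy of its model.
            uplink = np.stack([rand_k(x[i], k, rng) for i in range(n)])
            # Downlink: the server compresses the average before broadcasting.
            avg = rand_k(uplink.mean(axis=0), k, rng)
            # Aggregation step: pull each local model toward the average.
            x -= (alpha * lam / (n * p)) * (x - avg)
            rounds += 1
        else:
            # Local step: no communication, plain gradient step on f_i.
            grads = x - targets
            x -= (alpha / (n * (1 - p))) * grads
    return x, rounds
```

Calling `compressed_l2gd(b, lam=0.0)` on an array `b` of per-device targets switches off the pull toward the average, so each device simply fits its own data. Larger values of `lam` draw the local models closer together, at the price of noise injected by the compressor.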

Findings

01 Maintains a convergence rate similar to that of vanilla SGD without compression

02 Reduces the communication bottleneck through bidirectional compression

03 Effective on both convex and non-convex problems

Abstract

In contrast to training traditional machine learning (ML) models in data centers, federated learning (FL) trains ML models over local datasets contained on resource-constrained heterogeneous edge devices. Existing FL algorithms aim to learn a single global model for all participating devices, which may not be helpful to all devices participating in the training due to the heterogeneity of the data across the devices. Recently, Hanzely and Richtárik (2020) proposed a new formulation for training personalized FL models aimed at balancing the trade-off between the traditional global model and the local models that could be trained by individual devices using their private data only. They derived a new algorithm, called Loopless Gradient Descent (L2GD), to solve it and showed that this algorithm leads to improved communication complexity guarantees in regimes when more personalization…
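For reference, the personalized formulation of Hanzely and Richtárik (2020) mentioned in the abstract is usually written as below. The notation (n devices, local models x_i, local losses f_i, penalty parameter λ) follows that paper and is not defined on this page, so treat it as an assumption here.

```latex
\min_{x_1,\dots,x_n \in \mathbb{R}^d} \; F(x) :=
  \underbrace{\frac{1}{n}\sum_{i=1}^{n} f_i(x_i)}_{f(x)}
  + \lambda \, \underbrace{\frac{1}{2n}\sum_{i=1}^{n} \left\| x_i - \bar{x} \right\|^2}_{\psi(x)},
\qquad
\bar{x} := \frac{1}{n}\sum_{i=1}^{n} x_i
```

Setting λ = 0 decouples the devices, so each one trains a purely local model. As λ grows, the penalty forces all x_i toward their average, which recovers the single global model in the limit.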

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Age of Information Optimization

Methods: Stochastic Gradient Descent