M22: A Communication-Efficient Algorithm for Federated Learning Inspired by Rate-Distortion

Yangyi Liu; Stefano Rini; Sadaf Salehkalaibar; Jun Chen

arXiv:2301.09269·cs.LG·January 24, 2023

TL;DR

This paper introduces M22, a rate-distortion inspired gradient compression algorithm for federated learning, optimizing communication efficiency while maintaining model accuracy through novel distortion measures and distribution assumptions.

Contribution

The paper proposes a new gradient compression method based on rate-distortion theory, with a family of distortion measures and distribution assumptions, improving communication efficiency in federated learning.

Findings

01

The M22 algorithm achieves higher accuracy per bit communicated.

02

The choice of assumed gradient distribution and distortion measure significantly affects performance.

03

The method provides substantial improvements over existing compression techniques.

Abstract

In federated learning (FL), the communication constraint between the remote learners and the Parameter Server (PS) is a crucial bottleneck. For this reason, model updates must be compressed so as to minimize the loss in accuracy resulting from the communication constraint. This paper proposes the "$M$-magnitude weighted $L_{2}$ distortion + $2$ degrees of freedom" (M22) algorithm, a rate-distortion inspired approach to gradient compression for federated training of deep neural networks (DNNs). In particular, we propose a family of distortion measures between the original gradient and its reconstruction, which we refer to as the "$M$-magnitude weighted $L_{2}$" distortion, and we assume that gradient updates follow an i.i.d. distribution, either generalized normal or Weibull, each of which has two degrees of freedom. In both the distortion measure and the gradient, there is one free…
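To make the idea of a magnitude-weighted distortion concrete, here is a minimal sketch, not the authors' implementation. The abstract does not define the distortion, so the per-entry form |g|^M (g - g_hat)^2 used below is an assumption. The function names `m_weighted_l2` and `weighted_lloyd` are hypothetical, and the quantizer is a generic Lloyd-style design loop swapped in for illustration. With M = 0 the measure reduces to the ordinary squared error; larger M penalizes errors on large-magnitude gradient entries more heavily.

```python
def m_weighted_l2(g, g_hat, M):
    """Assumed distortion: sum_i |g_i|^M * (g_i - g_hat_i)^2 (M = 0 gives plain L2)."""
    return sum(abs(x) ** M * (x - y) ** 2 for x, y in zip(g, g_hat))


def weighted_lloyd(samples, levels, M, iters=20):
    """Lloyd-style scalar quantizer design under the assumed weighted distortion.

    Hypothetical helper: the weight |x|^M depends only on the sample x, so the
    best center for a sample is still the nearest one, and the best center for
    a bucket is the |x|^M-weighted mean of its samples.
    """
    s = sorted(samples)
    n = len(s)
    # Deterministic initialization at evenly spaced sample quantiles.
    centers = [s[(2 * j + 1) * n // (2 * levels)] for j in range(levels)]
    for _ in range(iters):
        buckets = [[] for _ in centers]
        for x in s:
            k = min(range(levels), key=lambda j: (x - centers[j]) ** 2)
            buckets[k].append(x)
        for j, bucket in enumerate(buckets):
            weights = [abs(x) ** M for x in bucket]
            total = sum(weights)
            if total > 0:  # leave the center unchanged for empty or zero-weight buckets
                centers[j] = sum(w * x for w, x in zip(weights, bucket)) / total
    return centers


if __name__ == "__main__":
    grads = [-2.0, -1.0, 1.0, 2.0]
    print(weighted_lloyd(grads, levels=2, M=0))  # centers under plain L2
    print(weighted_lloyd(grads, levels=2, M=2))  # centers pulled toward large magnitudes
```

Under these assumptions, raising M moves the reconstruction points toward the large-magnitude entries, which illustrates how a single exponent can trade accuracy on small entries for accuracy on large ones.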

Code & Models

Repositories

yangyiliu21/fl_rd (tf, Official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning