BICompFL: Stochastic Federated Learning with Bi-Directional Compression

Maximilian Egger; Rawad Bitar; Antonia Wachter-Zeh; Nir Weinberger; Deniz Gündüz

arXiv:2502.00206·cs.LG·February 4, 2025

Open Access 3 Reviews

TL;DR

BICompFL introduces a bi-directional compression method for stochastic federated learning that reduces communication cost by an order of magnitude compared to multiple benchmarks while maintaining state-of-the-art accuracy, supported by theoretical analysis and experimental validation.

Contribution

The paper proposes BICompFL, a novel approach to bi-directional compression in stochastic federated learning that addresses the challenges arising when both uplink and downlink communication are constrained.

Findings

01 · Reduces communication cost by an order of magnitude.

02 · Maintains state-of-the-art accuracy.

03 · Provides theoretical analysis of communication interplay.

Abstract

We address the prominent communication bottleneck in federated learning (FL). We specifically consider stochastic FL, in which models or compressed model updates are specified by distributions rather than deterministic parameters. Stochastic FL offers a principled approach to compression, and has been shown to reduce the communication load under perfect downlink transmission from the federator to the clients. However, in practice, both the uplink and downlink communications are constrained. We show that bi-directional compression for stochastic FL has inherent challenges, which we address by introducing BICompFL. Our BICompFL is experimentally shown to reduce the communication cost by an order of magnitude compared to multiple benchmarks, while maintaining state-of-the-art accuracies. Theoretically, we study the communication cost of BICompFL through a new analysis of an…
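To make the idea of communicating a model "specified by a distribution" more concrete, the following is a minimal sketch of sample transmission via importance sampling with shared randomness, the kind of building block the reviews below attribute to prior work. It is a generic illustration, not the paper's algorithm: the function names (`candidates`, `encode`, `decode`), the Bernoulli mask model, the seed, and the number of candidates `n` are all assumptions chosen for the example.

```python
import math
import random


def bernoulli_logpmf(x, p):
    """Log-probability of binary vector x under independent Bernoulli(p[i])."""
    return sum(math.log(pi if xi else 1.0 - pi) for xi, pi in zip(x, p))


def candidates(prior, n, seed):
    """Draw n candidate masks from the prior using the shared seed (hypothetical helper)."""
    rng = random.Random(seed)
    return [[1 if rng.random() < q else 0 for q in prior] for _ in range(n)]


def encode(posterior, prior, n, seed, rng):
    """Sender: choose a candidate index with probability proportional to p(x)/q(x)."""
    cands = candidates(prior, n, seed)
    log_w = [bernoulli_logpmf(c, posterior) - bernoulli_logpmf(c, prior) for c in cands]
    m = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - m) for lw in log_w]
    return rng.choices(range(n), weights=weights, k=1)[0]


def decode(index, prior, n, seed):
    """Receiver: regenerate the same candidates from the shared seed and look up the index."""
    return candidates(prior, n, seed)[index]


# Assumed toy setup: 8 mask entries, uniform prior, a sharper local posterior.
prior = [0.5] * 8
posterior = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.95, 0.05]
n = 64
idx = encode(posterior, prior, n, seed=1234, rng=random.Random(0))
mask = decode(idx, prior, n, seed=1234)
bits = math.ceil(math.log2(n))  # only the index is transmitted
```

In this sketch only the index `idx` crosses the channel, costing `bits` bits regardless of the mask length, because both sides regenerate the same candidates from the shared seed. How many candidates are needed for the selected sample to approximate the posterior well is exactly the kind of question a communication-cost analysis has to answer.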

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 5 · Confidence 3

Strengths

The results are strong and practical merits are high.

Weaknesses

In a nutshell, the proposed idea is to bring in a bidirectional element to FedPM of Isik et al. 2023. Probabilistic mask training, synchronized randomness and importance sampling are all from the existing work of Isik et al. 2023 and 2024. The only significant new element seems to be the introduction of downlink cost (as described by the n_DL dependent term in the first equation of section 5) in the analysis and performance evaluation. Thus, overall novelty is not high. Also, the authors would

Reviewer 02 · Rating 5 · Confidence 4

Strengths

Improved communication bandwidth at the same performance, together with a theoretical analysis.

Weaknesses

A modification of Isik et al., which itself addresses a fairly narrow problem in federated learning. There exist many proposals in this direction, for instance the papers by Philippenko et al., Avdiukhin et al., etc., so the paper, while novel, is not highly original. The paper's results are good, but I am not sure they are good enough for a competitive venue like ICLR.

Reviewer 03 · Rating 3 · Confidence 5

Strengths

The attempt to study the shared randomness in a FL setting seems interesting.

Weaknesses

There are some issues with the paper:

- An important issue is that, given this paper considers various schemes within a system model that includes shared common randomness, the performance of previous algorithms like FedAvg should also be re-evaluated under the assumption of shared randomness, with results reported in the paper for comparison. Specifically, it is unclear whether the improvement shown in Fig. 2 arises from the availability of shared randomness or from the algorithm itself. If the

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks