BICompFL: Stochastic Federated Learning with Bi-Directional Compression
Maximilian Egger, Rawad Bitar, Antonia Wachter-Zeh, Nir Weinberger, and Deniz Gündüz

TL;DR
BICompFL introduces a bi-directional compression method for stochastic federated learning that reduces communication cost by an order of magnitude compared to multiple benchmarks while maintaining state-of-the-art accuracy, supported by theoretical analysis and experimental validation.
Contribution
The paper proposes BICompFL, a novel approach to bi-directional compression in stochastic federated learning, addressing inherent challenges and improving communication efficiency.
Findings
Reduces communication cost by an order of magnitude.
Maintains state-of-the-art accuracy.
Provides theoretical analysis of communication interplay.
Abstract
We address the prominent communication bottleneck in federated learning (FL). We specifically consider stochastic FL, in which models or compressed model updates are specified by distributions rather than deterministic parameters. Stochastic FL offers a principled approach to compression, and has been shown to reduce the communication load under perfect downlink transmission from the federator to the clients. However, in practice, both the uplink and downlink communications are constrained. We show that bi-directional compression for stochastic FL has inherent challenges, which we address by introducing BICompFL. Our BICompFL is experimentally shown to reduce the communication cost by an order of magnitude compared to multiple benchmarks, while maintaining state-of-the-art accuracies. Theoretically, we study the communication cost of BICompFL through a new analysis of an…
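The abstract does not spell out how a distribution, rather than a deterministic parameter vector, can be communicated cheaply. One of the reviews on this page says the method builds on probabilistic masks, synchronized randomness, and importance sampling. The sketch below illustrates that general idea only and is not the authors' BICompFL algorithm: sender and receiver draw the same candidate masks from a shared seed, and the sender transmits just the index of one candidate chosen by importance weights. All function names, the Bernoulli mask model, and the parameter values are assumptions made for illustration.

```python
import math
import random


def bernoulli_log_prob(mask, probs):
    """Log-probability of a binary mask under independent Bernoulli(probs)."""
    return sum(math.log(p if m else 1.0 - p) for m, p in zip(mask, probs))


def shared_candidates(seed, prior, n_candidates):
    """Candidate masks drawn from the prior; identical on both sides given the seed."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for p in prior]
            for _ in range(n_candidates)]


def encode(posterior, prior, seed, n_candidates, private_rng):
    """Sender: pick a candidate index with probability proportional to q(x)/p(x)."""
    candidates = shared_candidates(seed, prior, n_candidates)
    log_w = [bernoulli_log_prob(c, posterior) - bernoulli_log_prob(c, prior)
             for c in candidates]
    shift = max(log_w)  # subtract the max for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    return private_rng.choices(range(n_candidates), weights=weights, k=1)[0]


def decode(index, prior, seed, n_candidates):
    """Receiver: regenerate the same candidates and look up the transmitted index."""
    return shared_candidates(seed, prior, n_candidates)[index]


if __name__ == "__main__":
    prior = [0.5] * 8                                       # hypothetical shared prior
    posterior = [0.9, 0.1, 0.8, 0.2, 0.9, 0.1, 0.7, 0.3]    # hypothetical local mask probabilities
    n = 16
    idx = encode(posterior, prior, seed=42, n_candidates=n,
                 private_rng=random.Random(0))
    mask = decode(idx, prior, seed=42, n_candidates=n)
    print(idx, mask, "bits sent:", math.ceil(math.log2(n)))
```

In this sketch only the index crosses the channel, so the cost is `ceil(log2(n_candidates))` bits regardless of the mask length. How many candidates are needed for the selected mask to follow the posterior closely depends on how far the posterior is from the prior, which this toy example does not analyze.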
Peer Reviews
Decision · Submitted to ICLR 2025
The results are strong and practical merits are high.
In a nutshell, the proposed idea is to bring in a bidirectional element to FedPM of Isik et al. 2023. Probabilistic mask training, synchronized randomness and importance sampling are all from the existing work of Isik et al. 2023 and 2024. The only significant new element seems to be the introduction of downlink cost (as described by the n_DL dependent term in the first equation of section 5) in the analysis and performance evaluation. Thus, overall novelty is not high. Also, the authors would
The improvements in the communication bandwidth, for the same performance and theoretical analysis.
A modification of Isik et al., which by itself addresses a rather narrow problem in federated learning. There exist many proposals in this direction, so the paper, while novel, is not highly original; see, for instance, the papers by Philippenko et al., Avdiukhin et al., etc. This paper's results are good, but I am not sure they are good enough for a competitive venue like ICLR.
The attempt to study the shared randomness in a FL setting seems interesting.
There are some issues with the paper: - An important issue is that, given this paper considers various schemes within a system model that includes shared common randomness, the performance of previous algorithms like FedAvg should also be re-evaluated under the assumption of shared randomness, with results reported in the paper for comparison. Specifically, it is unclear whether the improvement shown in Fig. 2 arises from the availability of shared randomness or from the algorithm itself. If the
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks
