Communication-Efficient Federated Learning via Regularized Sparse Random Networks
Mohamad Mestoukirdi, Omid Esrafilian, David Gesbert, Qianrui Li, Nicolas Gresset

TL;DR
This paper introduces a communication-efficient federated learning method that trains over-parameterized random networks by optimizing binary masks, significantly reducing communication costs while maintaining comparable accuracy.
Contribution
It proposes a novel regularization approach to optimize sparse binary masks, enabling efficient communication in federated learning with minimal performance loss.
Findings
Achieves up to five orders of magnitude reduction in communication and memory usage.
Maintains validation accuracy comparable to that of traditional methods.
Demonstrates effectiveness across extensive empirical experiments.
Abstract
This work presents a new method for enhancing communication efficiency in stochastic Federated Learning that trains over-parameterized random networks. In this setting, a binary mask is optimized instead of the model weights, which are kept fixed. The mask characterizes a sparse sub-network that is able to generalize as well as a smaller target network. Importantly, sparse binary masks are exchanged rather than the floating-point weights used in traditional federated learning, reducing communication cost to at most 1 bit per parameter (Bpp). We show that previous state-of-the-art stochastic methods fail to find sparse networks that can reduce the communication and storage overhead using consistent loss objectives. To address this, we propose adding a regularization term to local objectives that acts as a proxy for the entropy of the transmitted masks, thereby encouraging sparser solutions by…
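To make the "at most 1 bit per parameter" claim concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`sample_mask`, `masked_weights`, `empirical_entropy_bpp`), the Bernoulli sampling of the mask, the keep-probabilities 0.5 and 0.05, and the 32-bit float baseline are all hypothetical choices made for this example. The entropy function is a simple empirical measure of how compressible a mask is, not the regularization term proposed in the paper.

```python
import math
import random

def sample_mask(probs, rng):
    """Draw a binary mask: entry i is 1 with probability probs[i] (assumed Bernoulli sampling)."""
    return [1 if rng.random() < p else 0 for p in probs]

def masked_weights(weights, mask):
    """Apply the mask to the fixed random weights; the weights themselves are never updated."""
    return [w * m for w, m in zip(weights, mask)]

def empirical_entropy_bpp(mask):
    """Binary entropy (bits per parameter) of the fraction of ones in the mask."""
    p = sum(mask) / len(mask)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

rng = random.Random(0)
n_params = 10_000
weights = [rng.gauss(0.0, 1.0) for _ in range(n_params)]  # fixed random initialization

# Hypothetical keep-probabilities: a dense mask (0.5) versus a sparse one (0.05).
dense_mask = sample_mask([0.5] * n_params, rng)
sparse_mask = sample_mask([0.05] * n_params, rng)

dense_bpp = empirical_entropy_bpp(dense_mask)    # close to the 1 Bpp upper bound
sparse_bpp = empirical_entropy_bpp(sparse_mask)  # well below 1 Bpp
float_bpp = 32.0  # assumed cost of sending one float32 weight instead of a mask bit

print(f"dense mask:  {dense_bpp:.3f} Bpp")
print(f"sparse mask: {sparse_bpp:.3f} Bpp")
print(f"float32 weights: {float_bpp:.0f} Bpp")
```

The sketch shows why sparsity matters for communication: a mask whose entries are mostly zero has lower entropy, so an entropy coder can in principle send it with fewer than 1 bit per parameter, whereas a mask that is roughly half ones sits near the 1 Bpp ceiling.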
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
