Towards Federated Learning with On-device Training and Communication in 8-bit Floating Point
Bokun Wang, Axel Berg, Durmus Alp Emre Acar, Chuteng Zhou

TL;DR
This paper explores 8-bit floating point (FP8) training in federated learning, which cuts client-server communication costs by at least 2.9x relative to an FP32 baseline and enables efficient on-device training while maintaining model accuracy.
Contribution
The paper introduces a federated learning method that pairs FP8 client training with a global FP32 server model, provides a convergence analysis, and demonstrates substantial communication savings across multiple models and datasets.
Findings
Achieves at least a 2.9x reduction in communication costs compared to an FP32 baseline at the same trained model accuracy
Maintains accuracy comparable to the FP32 baseline
Validates effectiveness across diverse models and datasets
Abstract
Recent work has shown that 8-bit floating point (FP8) can be used to train neural networks efficiently, at lower computational cost than training in FP32/FP16. In this work, we investigate the use of FP8 training in a federated learning context. This approach brings not only the usual benefits of FP8, which are desirable for on-device training at the edge, but also reduces client-server communication costs through significant weight compression. We present a novel method that combines FP8 client training with a global FP32 server model, and we provide a convergence analysis. Experiments with various machine learning models and datasets show that, to reach the same trained model accuracy, our method consistently reduces communication by at least 2.9x compared to an FP32 baseline.
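To make the idea of FP8 client payloads feeding a full-precision server model concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: the E4M3 format (4 exponent bits, 3 mantissa bits), the per-tensor max scaling, the deterministic round-to-nearest, the plain averaging, and all function names (`quantize_e4m3`, `compress`, `server_average`, `bits_ratio`) are assumptions chosen for illustration.

```python
import math

# Assumed format: FP8 E4M3 (largest finite magnitude 448, smallest normal 2**-6).
E4M3_MAX = 448.0
E4M3_MIN_EXP = -6
MANTISSA_BITS = 3


def quantize_e4m3(x: float) -> float:
    """Round x to the nearest E4M3-representable value, saturating at +/-448."""
    if x == 0.0 or math.isnan(x):
        return x
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    # frexp returns mag = m * 2**e with m in [0.5, 1), so floor(log2(mag)) = e - 1.
    # Clamping the exponent at E4M3_MIN_EXP handles the subnormal range.
    exp = max(math.frexp(mag)[1] - 1, E4M3_MIN_EXP)
    step = 2.0 ** (exp - MANTISSA_BITS)  # spacing of representable values here
    return sign * min(round(mag / step) * step, E4M3_MAX)


def compress(weights):
    """Hypothetical client step: scale the tensor into E4M3 range, then round."""
    peak = max(abs(w) for w in weights) or 1.0  # avoid a zero scale
    scale = peak / E4M3_MAX
    return scale, [quantize_e4m3(w / scale) for w in weights]


def server_average(payloads):
    """Hypothetical server step: dequantize each payload, average in full precision."""
    total = [0.0] * len(payloads[0][1])
    for scale, quantized in payloads:
        for i, value in enumerate(quantized):
            total[i] += scale * value
    return [t / len(payloads) for t in total]


def bits_ratio(num_weights: int) -> float:
    """Per-round upload size in FP32 versus FP8 plus one 32-bit scale."""
    return (32 * num_weights) / (8 * num_weights + 32)


client_a = compress([0.5, -1.0, 0.25])
client_b = compress([0.1, 0.2, -0.3])
global_update = server_average([client_a, client_b])
```

The per-round ratio returned by `bits_ratio` approaches 4x for large tensors. That is a different quantity from the 2.9x figure in the abstract, which is measured as the communication needed to reach the same trained model accuracy as the FP32 baseline.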
Taxonomy
Topics: DNA and Biological Computing · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques
