Fed-Sophia: A Communication-Efficient Second-Order Federated Learning Algorithm
Ahmed Elbakary, Chaouki Ben Issaid, Mohammad Shehab, Karim Seddik, Tamer ElBatt, Mehdi Bennis

TL;DR
Fed-Sophia is a scalable second-order federated learning algorithm that uses a lightweight estimate of curvature to accelerate convergence in large models, outperforming first- and second-order baselines in the authors' numerical evaluation.
Contribution
It introduces a novel second-order federated learning method combining gradient averaging, clipping, and Hessian diagonal estimation for improved scalability and robustness.
Findings
Fed-Sophia outperforms first- and second-order baselines.
It demonstrates robustness across various scenarios.
The method scales well with large models.
Abstract
Federated learning is a machine learning approach where multiple devices collaboratively learn with the help of a parameter server by sharing only their local updates. While gradient-based optimization techniques are widely adopted in this domain, the curvature information that second-order methods exploit is crucial for guiding and speeding up convergence. This paper introduces a scalable second-order method that allows curvature information to be used in large federated models. Our method, coined Fed-Sophia, combines a weighted moving average of the gradient with a clipping operation to find the descent direction. In addition, a lightweight estimate of the Hessian's diagonal is used to incorporate the curvature information. Numerical evaluation shows the superiority, robustness, and scalability of the proposed Fed-Sophia scheme compared to first- and second-order baselines.
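The abstract names three ingredients (a moving average of the gradient, a clipping operation, and a lightweight Hessian-diagonal estimate) but does not give the update rule. The following is a minimal sketch of how they could fit together, not the authors' implementation. It assumes a Sophia-style preconditioned step, a Hutchinson-type diagonal estimator, and plain model averaging at the server. All function names, hyperparameter values, and the toy quadratic clients are hypothetical.

```python
import numpy as np

def hutchinson_diag(hvp, dim, rng):
    """Estimate diag(H) as u * (H u) with a random Rademacher vector u (assumed estimator)."""
    u = rng.choice([-1.0, 1.0], size=dim)
    return u * hvp(u)

def local_update(theta, grad, hvp, rng, steps=5, lr=0.1,
                 beta1=0.9, beta2=0.99, rho=1.0, eps=1e-12):
    """Run a few Sophia-style local steps on one client (illustrative values)."""
    m = np.zeros_like(theta)  # moving average of gradients
    h = np.zeros_like(theta)  # moving average of Hessian-diagonal estimates
    for _ in range(steps):
        m = beta1 * m + (1 - beta1) * grad(theta)
        h_hat = hutchinson_diag(lambda v: hvp(theta, v), theta.size, rng)
        h = beta2 * h + (1 - beta2) * h_hat
        # Precondition by curvature, then clip each coordinate to [-rho, rho].
        theta = theta - lr * np.clip(m / np.maximum(h, eps), -rho, rho)
    return theta

# Hypothetical clients: loss_i(theta) = 0.5 * sum(a_i * (theta - c_i)**2).
clients = [
    (np.array([1.0, 2.0, 4.0]), np.array([1.0, 1.0, 1.0])),
    (np.array([2.0, 1.0, 3.0]), np.array([-1.0, 2.0, 0.5])),
]

def global_loss(theta):
    return np.mean([0.5 * np.sum(a * (theta - c) ** 2) for a, c in clients])

rng = np.random.default_rng(0)
theta = np.array([5.0, -5.0, 5.0])
initial_loss = global_loss(theta)

for _ in range(30):  # communication rounds
    local_models = [
        local_update(theta,
                     lambda t, a=a, c=c: a * (t - c),  # gradient of the quadratic
                     lambda t, v, a=a: a * v,          # Hessian-vector product
                     rng)
        for a, c in clients
    ]
    theta = np.mean(local_models, axis=0)  # server averages the client models

final_loss = global_loss(theta)
```

In this toy problem the Hessian is diagonal, so the estimator recovers it exactly; for a real model it would only be a noisy estimate. The clip bounds each coordinate's step by `lr * rho`, which keeps early updates stable while the curvature average `h` is still small.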
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Cooperative Communication and Network Coding
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
