Unlocking FedNL: Self-Contained Compute-Optimized Implementation
Konstantin Burlachenko, Peter Richtárik

TL;DR
This paper introduces a self-contained, compute-optimized implementation of FedNL, significantly improving practical deployment and performance for federated learning with second-order methods.
Contribution
The work provides a practical, self-contained FedNL implementation that reduces wall-clock time by 1000x and supports multi-node deployment and integration into resource-constrained applications.
Findings
FedNL outperforms CVXPY in single-node logistic regression training.
FedNL surpasses Apache Spark and Ray/Scikit-Learn in multi-node settings.
The implementation reduces wall-clock time by 1000x.
Abstract
Federated Learning (FL) is an emerging paradigm that enables intelligent agents to collaboratively train Machine Learning (ML) models in a distributed manner, eliminating the need to share their local data. Recent work (arXiv:2106.02969) introduces a family of Federated Newton Learn (FedNL) algorithms, marking a significant step towards applying second-order methods to FL and large-scale optimization. However, the reference FedNL prototype exhibits three serious practical drawbacks: (i) it requires 4.8 hours to launch a single experiment on a server-grade workstation; (ii) the prototype only simulates a multi-node setting; (iii) integrating the prototype into resource-constrained applications is challenging. To bridge the gap between theory and practice, we present a self-contained implementation of FedNL, FedNL-LS, and FedNL-PP for single-node and multi-node settings. Our work resolves the…
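For readers unfamiliar with second-order methods, the following is a minimal single-node sketch of Newton's method for L2-regularized logistic regression, the benchmark problem named in the findings. It is not the FedNL algorithm and not the authors' implementation: it omits the federated and communication aspects entirely, and the function names, the toy data, and the regularization value `lam=0.1` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def grad_and_hessian(w, X, y, lam):
    """Gradient and Hessian of
    f(w) = (1/n) * sum_i log(1 + exp(-y_i * <x_i, w>)) + (lam/2) * ||w||^2,
    with labels y_i in {-1, +1}."""
    n, d = len(X), len(w)
    g = [lam * wj for wj in w]
    H = [[lam if j == k else 0.0 for k in range(d)] for j in range(d)]
    for xi, yi in zip(X, y):
        margin = yi * sum(a * b for a, b in zip(xi, w))
        s = sigmoid(-margin)
        for j in range(d):
            g[j] -= yi * s * xi[j] / n
            for k in range(d):
                H[j][k] += s * (1.0 - s) * xi[j] * xi[k] / n
    return g, H


def solve(A, b):
    # Gaussian elimination with partial pivoting on the augmented matrix.
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def newton(X, y, lam=0.1, iters=20, tol=1e-10):
    # Plain Newton iteration w <- w - H^{-1} g, without line search.
    w = [0.0] * len(X[0])
    for _ in range(iters):
        g, H = grad_and_hessian(w, X, y, lam)
        if math.sqrt(sum(gj * gj for gj in g)) < tol:
            break
        step = solve(H, g)
        w = [wj - sj for wj, sj in zip(w, step)]
    return w


# Hypothetical toy data: four 2-D points with labels in {-1, +1}.
X_demo = [[1.0, 2.0], [2.0, 1.0], [-1.0, -1.5], [-2.0, -0.5]]
y_demo = [1, 1, -1, -1]
w_demo = newton(X_demo, y_demo)
```

The sketch takes full Newton steps, which is adequate for this small, strongly convex toy problem; on harder problems a step-size safeguard such as a line search is normally added.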
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Database Systems and Queries
Methods: Logistic Regression
