On Second-order Optimization Methods for Federated Learning
Sebastian Bischoff, Stephan Günnemann, Martin Jaggi, Sebastian U. Stich

TL;DR
This paper evaluates second-order optimization methods in federated learning, finds that FedAvg remains competitive when all methods are given an equal amount of local computation, and proposes a new second-order method with a global line search.
Contribution
The paper compares second-order methods with FedAvg in federated learning and introduces a novel second-order update with a global line search.
Findings
FedAvg performs competitively against second-order methods under fair metrics.
A new second-order method with a global line search is proposed.
Second-order methods can be effectively adapted for federated learning.
Abstract
We consider federated learning (FL), where the training data is distributed across a large number of clients. The standard optimization method in this setting is Federated Averaging (FedAvg), which performs multiple local first-order optimization steps between communication rounds. In this work, we evaluate the performance of several second-order distributed methods with local steps in the FL setting which promise to have favorable convergence properties. We (i) show that FedAvg performs surprisingly well against its second-order competitors when evaluated under fair metrics (equal amount of local computations), in contrast to the results of previous work. Based on our numerical study, we propose (ii) a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
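To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a synthetic least-squares objective split across a few clients. All function names (`make_clients`, `fedavg_round`, `newton_linesearch_round`) and the fixed grid of candidate step sizes are hypothetical choices made for illustration. The first routine shows one FedAvg round (several local gradient steps, then averaging). The second shows one round in which clients send local Newton directions and the server picks a step size by comparing the global loss.

```python
import numpy as np

def make_clients(num_clients=4, n=50, d=5, seed=0):
    """Synthetic least-squares data, one (A, b) pair per client (assumed setup)."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    clients = []
    for _ in range(num_clients):
        A = rng.normal(size=(n, d))
        b = A @ w_true + 0.1 * rng.normal(size=n)
        clients.append((A, b))
    return clients

def local_loss(w, A, b):
    r = A @ w - b
    return 0.5 * np.mean(r ** 2)

def local_grad(w, A, b):
    return A.T @ (A @ w - b) / len(b)

def global_loss(w, clients):
    # Average of the client losses (clients hold equally sized datasets here).
    return float(np.mean([local_loss(w, A, b) for A, b in clients]))

def fedavg_round(w, clients, lr=0.05, local_steps=5):
    """One FedAvg round: local first-order steps on each client, then averaging."""
    local_models = []
    for A, b in clients:
        w_k = w.copy()
        for _ in range(local_steps):
            w_k -= lr * local_grad(w_k, A, b)
        local_models.append(w_k)
    return np.mean(local_models, axis=0)

def newton_linesearch_round(w, clients, step_sizes=(1.0, 0.5, 0.25, 0.125, 0.0)):
    """One round with local second-order directions and a global line search.

    The grid of step sizes is an assumption of this sketch; including 0.0
    guarantees the global loss never increases.
    """
    directions = []
    for A, b in clients:
        # Local Hessian of the least-squares loss, with a small damping term.
        H = A.T @ A / len(b) + 1e-6 * np.eye(A.shape[1])
        directions.append(-np.linalg.solve(H, local_grad(w, A, b)))
    d_avg = np.mean(directions, axis=0)
    # Global line search: each candidate costs one loss evaluation per client.
    best = min(step_sizes, key=lambda s: global_loss(w + s * d_avg, clients))
    return w + best * d_avg
```

A fair comparison in the spirit of the abstract would count the local work of each routine (here, gradient evaluations versus a linear solve plus extra loss evaluations) rather than only the number of communication rounds.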
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Cryptography and Data Security
