On Second-order Optimization Methods for Federated Learning

Sebastian Bischoff, Stephan Günnemann, Martin Jaggi, Sebastian U. Stich

arXiv:2109.02388 · cs.LG · September 7, 2021

TL;DR

This paper evaluates second-order optimization methods with local steps in federated learning. It finds that FedAvg performs surprisingly well against them when local computation is held equal, and it proposes a new variant that combines second-order local updates with a global line search.

Contribution

The paper compares second-order methods with FedAvg in federated learning and introduces a novel second-order update with a global line search.

Findings

01 FedAvg performs competitively against second-order methods under fair metrics.

02 A new second-order method with a global line search is proposed.

03 Second-order methods can be effectively adapted for federated learning.

Abstract

We consider federated learning (FL), where the training data is distributed across a large number of clients. The standard optimization method in this setting is Federated Averaging (FedAvg), which performs multiple local first-order optimization steps between communication rounds. In this work, we evaluate the performance of several second-order distributed methods with local steps in the FL setting which promise to have favorable convergence properties. We (i) show that FedAvg performs surprisingly well against its second-order competitors when evaluated under fair metrics (equal amount of local computations), in contrast to the results of previous work. Based on our numerical study, we propose (ii) a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
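The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a least-squares loss on each client, full client participation, and weighting by client data size. The function names (`local_gd`, `fedavg_round`, `global_loss`, `newton_linesearch_round`) and all parameters are hypothetical. The second routine only shows the general shape of "local second-order directions plus a global line search", here using a simple backtracking rule on the global loss; the paper's actual variant may differ.

```python
import numpy as np

def local_gd(w, X, y, lr, steps):
    """Run `steps` gradient-descent steps on one client's least-squares loss."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(w, clients, lr=0.1, steps=5):
    """One FedAvg round: local first-order steps, then a size-weighted average."""
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    local_models = [local_gd(w, X, y, lr, steps) for X, y in clients]
    return sum(n * wk for n, wk in zip(sizes, local_models)) / sizes.sum()

def global_loss(w, clients):
    """Least-squares loss averaged over all samples of all clients."""
    total = sum(len(y) for _, y in clients)
    return sum(0.5 * np.sum((X @ w - y) ** 2) for X, y in clients) / total

def newton_linesearch_round(w, clients, ridge=1e-6, shrink=0.5, max_halvings=20):
    """Assumed sketch: average local Newton directions, then backtrack on the global loss."""
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    directions = []
    for X, y in clients:
        grad = X.T @ (X @ w - y) / len(y)
        # Local Hessian of the least-squares loss, with a small ridge for stability.
        hess = X.T @ X / len(y) + ridge * np.eye(len(w))
        directions.append(-np.linalg.solve(hess, grad))
    d = sum(n * dk for n, dk in zip(sizes, directions)) / sizes.sum()

    # Global line search: shrink the step until the global loss decreases.
    base = global_loss(w, clients)
    t = 1.0
    for _ in range(max_halvings):
        candidate = w + t * d
        if global_loss(candidate, clients) < base:
            return candidate
        t *= shrink
    return w  # no improving step found; keep the current model
```

Each call to `global_loss` inside the line search stands in for an extra communication round in which clients report their loss at the candidate point, which is the kind of cost a fair comparison has to account for alongside local computation.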

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Cryptography and Data Security