Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning

Kazusato Oko, Shunta Akiyama, Tomoya Murata, and Taiji Suzuki

arXiv:2209.00361 · cs.LG · October 5, 2022

Open Access

TL;DR

This paper introduces SLEDGE, a versatile single-loop gradient estimation method for nonconvex optimization that achieves near-optimal complexity, supports second-order optimality, and enhances federated learning efficiency.

Contribution

The paper presents SLEDGE, a novel single-loop algorithm that avoids periodic gradient refreshes and attains multiple optimality guarantees, improving federated learning performance.

Findings

01

SLEDGE achieves nearly optimal gradient complexity.

02

The method guarantees second-order optimality of its output and converges exponentially fast in regions where the PL condition holds.

03

SLEDGE outperforms existing federated learning algorithms in communication efficiency.

Abstract

While variance reduction methods have shown great success in solving large scale optimization problems, many of them suffer from accumulated errors and, therefore, should periodically require the full gradient computation. In this paper, we present a single-loop algorithm named SLEDGE (Single-Loop mEthoD for Gradient Estimator) for finite-sum nonconvex optimization, which does not require periodic refresh of the gradient estimator but achieves nearly optimal gradient complexity. Unlike existing methods, SLEDGE has the advantage of versatility; (i) second-order optimality, (ii) exponential convergence in the PL region, and (iii) smaller complexity under less heterogeneity of data. We build an efficient federated learning algorithm by exploiting these favorable properties. We show the first and second-order optimality of the output and also provide analysis under PL conditions. When the…
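For readers unfamiliar with the terms in the abstract, the block below sketches the standard textbook definitions. The notation ($n$, $f_i$, $\varepsilon$, $\delta$, $\mu$, $L$) is assumed here for illustration and may differ from the paper's own; the final line is the classical rate for plain gradient descent under the PL condition, shown only to illustrate what "exponential convergence" means, not as SLEDGE's rate.

```latex
% Finite-sum nonconvex problem (each f_i smooth, possibly nonconvex)
\min_{x \in \mathbb{R}^d} f(x) := \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% First-order optimality: x is an \varepsilon-stationary point
\|\nabla f(x)\| \le \varepsilon

% Second-order optimality: additionally, the Hessian has no strongly negative curvature
\lambda_{\min}\!\left(\nabla^2 f(x)\right) \ge -\delta

% Polyak-Lojasiewicz (PL) condition with constant \mu > 0, where f^* = \inf_x f(x)
\|\nabla f(x)\|^2 \ge 2\mu \left(f(x) - f^*\right)

% Classical consequence for gradient descent with step size 1/L on an L-smooth f
% satisfying PL: the optimality gap shrinks geometrically
f(x_t) - f^* \le \left(1 - \frac{\mu}{L}\right)^{t} \left(f(x_0) - f^*\right)
```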

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Bone and Joint Diseases