Local SGD for Near-Quadratic Problems: Improving Convergence under Unconstrained Noise Conditions

Andrey Sadchikov; Savelii Chezhegov; Aleksandr Beznosikov; Alexander Gasnikov

arXiv:2409.10478·math.OC·December 19, 2024

Open Access

TL;DR

This paper extends the theoretical understanding of Local SGD by introducing the concept of approximate quadraticity and analyzing its convergence under unbounded noise conditions, broadening its applicability.

Contribution

It proposes a new framework for analyzing Local SGD on near-quadratic problems without relying on Lipschitz Hessian or bounded variance assumptions.

Findings

01. Convergence guarantees for Local SGD under approximate quadraticity.

02. Analysis of Local SGD with unbounded noise conditions.

03. Broader applicability of Local SGD in practical scenarios.

Abstract

Distributed optimization plays an important role in modern large-scale machine learning and data processing systems by optimizing the utilization of computational resources. One of the classical and popular approaches is Local Stochastic Gradient Descent (Local SGD), characterized by multiple local updates before averaging, which is particularly useful in distributed environments to reduce communication bottlenecks and improve scalability. A typical feature of this method is the dependence on the frequency of communications. But in the case of a quadratic target function with homogeneous data distribution over all devices, the influence of frequency of communications vanishes. As a natural consequence, subsequent studies include the assumption of a Lipschitz Hessian, as this indicates the similarity of the optimized function to a quadratic one to some extent. However, in order to extend…
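
The abstract's remark that communication frequency stops mattering for a quadratic objective with homogeneous data can be illustrated with a small sketch. This is not the authors' code or experimental setup. The function name `local_sgd`, the diagonal test problem, the step size, and the additive Gaussian gradient noise are all assumptions chosen for illustration, and this noise is a simpler model than the unbounded noise conditions the paper studies.

```python
import numpy as np

def local_sgd(A, b, x0, num_workers, local_steps, rounds, lr, noise_std, seed=0):
    """Local SGD on f(x) = 0.5 x^T A x - b^T x, shared by every worker.

    Assumed setup: A is symmetric, and each stochastic gradient is the
    exact gradient A x - b plus independent Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    # One row per worker; all workers start from the same point.
    X = np.tile(np.asarray(x0, dtype=float), (num_workers, 1))
    for _ in range(rounds):
        for _ in range(local_steps):
            noise = noise_std * rng.standard_normal(X.shape)
            grads = X @ A.T - b + noise   # row m holds A x_m - b + noise_m
            X -= lr * grads               # independent local step on each worker
        X[:] = X.mean(axis=0)             # communication: average all workers
    return X[0].copy()

A = np.diag([1.0, 2.0])
b = np.array([1.0, 1.0])
x0 = np.zeros(2)

# Same total number of gradient steps (40), different communication frequency.
frequent = local_sgd(A, b, x0, num_workers=4, local_steps=1, rounds=40,
                     lr=0.1, noise_std=0.5)
rare = local_sgd(A, b, x0, num_workers=4, local_steps=8, rounds=5,
                 lr=0.1, noise_std=0.5)
print(frequent, rare)
```

Because the gradient of a quadratic is linear in x, the mean of the workers' gradients equals the gradient at the mean iterate plus the mean noise. With the same seed, and so the same noise sequence, the two runs should therefore agree up to floating-point rounding, however often the workers average. For a non-quadratic objective this identity fails, and the number of local steps enters the analysis.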

Taxonomy

Topics: Image and Signal Denoising Methods · Numerical methods in inverse problems · Model Reduction and Neural Networks