Accelerated Almost-Sure Convergence Rates for Nonconvex Stochastic Gradient Descent using Stochastic Learning Rates

Theodoros Mamalis; Dusan Stipanovic; Petros Voulgaris

arXiv:2110.12634·math.OC·November 11, 2021·1 cites

TL;DR

This paper demonstrates that incorporating stochastic learning rates into Stochastic Gradient Descent accelerates almost-sure convergence in nonconvex optimization, supported by theoretical analysis and empirical validation.

Contribution

The paper introduces a stochastic learning rate scheme for SGD, in which the learning rate carries a multiplicative random factor, and shows that it achieves faster almost-sure convergence rates in nonconvex problems than a deterministic learning rate.

Findings

1. Accelerated convergence rates with stochastic learning rates

2. Theoretical proof of improved almost-sure convergence

3. Empirical validation confirming theoretical results

Abstract

Large-scale optimization problems require algorithms that are both effective and efficient. One popular and proven algorithm is Stochastic Gradient Descent, which uses first-order gradient information to solve these problems. This paper studies the almost-sure convergence rates of Stochastic Gradient Descent when its learning rate is stochastic rather than deterministic. In particular, the learning rate is multiplied by a random factor, producing a stochastic learning rate scheme. Theoretical results show that, in a nonconvex setting, Stochastic Gradient Descent with an appropriate stochastic learning rate attains accelerated almost-sure convergence rates compared to a deterministic-learning-rate scheme. The theoretical results are verified empirically.
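
To make the idea of a multiplicative stochastic learning rate concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the distribution of the random factor, the deterministic step-size schedule, or the test problem, so all of these are assumptions chosen for illustration: a uniform factor on `[low, high]`, a `base_lr / sqrt(t)` schedule, Gaussian gradient noise, and a hypothetical one-dimensional nonconvex objective f(x) = x^2 + 3 sin^2(x). The function and parameter names are likewise hypothetical.

```python
import math
import random


def grad_f(x):
    """Gradient of the illustrative nonconvex objective f(x) = x^2 + 3*sin(x)^2.

    This objective is an assumption for demonstration only; its sole
    stationary point is x = 0.
    """
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


def sgd_stochastic_lr(grad, x0, base_lr, steps,
                      low=0.5, high=1.5, noise_std=0.1, seed=0):
    """Run SGD where each step size is scaled by a random factor.

    Update rule (assumed form): x <- x - (eta_t * u_t) * g_t, where
      eta_t = base_lr / sqrt(t)     deterministic schedule (assumed)
      u_t   ~ Uniform(low, high)    multiplicative factor (assumed)
      g_t   = grad(x) + noise       noisy gradient estimate
    Setting low == high == 1.0 recovers a deterministic learning rate.
    """
    rng = random.Random(seed)
    x = x0
    for t in range(1, steps + 1):
        eta_t = base_lr / math.sqrt(t)
        u_t = rng.uniform(low, high)
        g_t = grad(x) + rng.gauss(0.0, noise_std)
        x -= eta_t * u_t * g_t
    return x
```

Calling `sgd_stochastic_lr(grad_f, x0=2.0, base_lr=0.1, steps=2000)` runs the stochastic-learning-rate variant, and passing `low=1.0, high=1.0` gives a deterministic baseline for comparison. This toy setup only illustrates the mechanism; it is not intended to reproduce the accelerated rates established in the paper.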

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods