A new non-convex framework to improve asymptotical knowledge on generic stochastic gradient descent

Jean-Baptiste Fest; Audrey Repetti; Emilie Chouzenoux

arXiv:2307.06987·math.OC·July 17, 2023·MLSP

Open Access

TL;DR

This paper introduces a novel theoretical framework based on Kurdyka-Lojasiewicz theory to analyze the almost-sure convergence of stochastic gradient descent methods in non-convex optimization, providing new asymptotic guarantees.

Contribution

The paper presents a new Kurdyka-Lojasiewicz framework that establishes almost-sure convergence results for SGD in non-convex settings under mild conditions.
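
For orientation, the classical deterministic form of the Kurdyka-Lojasiewicz inequality is recalled below. This is the textbook statement, not necessarily the exact variant used in the paper, and the symbols ($f$, $x^*$, $U$, $\eta$, $\varphi$, $c$, $\theta$) are notation assumed here rather than taken from this page. A differentiable function $f$ satisfies the inequality at a stationary point $x^*$ if there exist $\eta > 0$, a neighborhood $U$ of $x^*$, and a concave function $\varphi$ that is continuously differentiable on $(0, \eta)$ with $\varphi(0) = 0$ and $\varphi' > 0$, such that:

```latex
\[
  \varphi'\bigl(f(x) - f(x^*)\bigr)\,\|\nabla f(x)\| \;\ge\; 1
  \quad \text{for all } x \in U \text{ with } f(x^*) < f(x) < f(x^*) + \eta .
\]
% Lojasiewicz special case: taking \varphi(s) = \frac{c}{1-\theta}\, s^{1-\theta}
% with c > 0 and \theta \in [0, 1) gives \varphi'(s) = c\, s^{-\theta}, hence
\[
  \bigl(f(x) - f(x^*)\bigr)^{\theta} \;\le\; c\,\|\nabla f(x)\| .
\]
```

Informally, the inequality prevents the objective from being arbitrarily flat near a stationary point, which is what makes convergence of the iterates themselves (and not only of the gradient norms) tractable.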

Findings

01. New convergence guarantees for SGD in non-convex optimization

02. Illustrations through toy simulation examples

03. Analysis of the impact of theoretical assumptions on SGD behavior

Abstract

Stochastic gradient optimization methods are broadly used to minimize non-convex smooth objective functions, for instance when training deep neural networks. However, theoretical guarantees on the asymptotic behavior of these methods remain scarce. Notably, ensuring almost-sure convergence of the iterates to a stationary point is quite challenging. In this work, we introduce a new Kurdyka-Lojasiewicz theoretical framework to analyze the asymptotic behavior of stochastic gradient descent (SGD) schemes when minimizing non-convex smooth objectives. In particular, our framework provides new almost-sure convergence results on iterates generated by any SGD method satisfying mild conditional descent conditions. We illustrate the proposed framework by means of several toy simulation examples. We illustrate the role of the considered theoretical assumptions, and investigate how SGD iterates are…
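
The setting described in the abstract can be sketched with a minimal toy run. This is not one of the paper's simulations: the objective f(x) = x**4 / 4 - x**2 / 2, the additive Gaussian gradient noise, the step-size schedule, and the names `grad_f` and `sgd` are all assumptions chosen for illustration. The objective is a smooth non-convex polynomial (so it satisfies a Lojasiewicz-type inequality at its stationary points -1, 0, and 1), and the step sizes gamma0 / (k + 1)**0.75 are not summable but are square-summable, a common choice that may differ from the conditions required by the paper.

```python
import random


def grad_f(x):
    """Gradient of the toy objective f(x) = x**4 / 4 - x**2 / 2 (assumed example)."""
    return x**3 - x


def sgd(x0, n_iters=20000, gamma0=0.1, decay=0.75, noise_std=0.5, seed=0):
    """Run plain SGD with decreasing steps and a noisy gradient oracle.

    All parameter names and default values are illustrative assumptions.
    """
    rng = random.Random(seed)  # local generator so the run is reproducible
    x = x0
    for k in range(n_iters):
        gamma_k = gamma0 / (k + 1) ** decay           # decreasing step size
        g_k = grad_f(x) + rng.gauss(0.0, noise_std)   # unbiased noisy gradient
        x -= gamma_k * g_k                            # SGD update
    return x


x_final = sgd(x0=2.0)
print("final iterate:", x_final)
print("gradient at final iterate:", grad_f(x_final))
```

On a single run like this one, the final iterate should sit close to one of the minimizers, with a gradient close to zero. A single trajectory only illustrates the behavior; it says nothing about almost-sure convergence, which is a statement over all realizations of the noise.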

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods