Asymptotic Analysis of Conditioned Stochastic Gradient Descent

Rémi Leluc; François Portier

arXiv:2006.02745 · math.ST · October 17, 2023 · 1 citation

TL;DR

This paper gives an asymptotic analysis of Conditioned SGD algorithms using martingale techniques in a discrete-time framework. It establishes their convergence and shows that they are asymptotically optimal when the conditioning matrix is an estimate of the inverse Hessian.

Contribution

The paper introduces a general framework for analyzing Conditioned SGD, establishes weak and almost sure convergence of the iterates, and shows asymptotic optimality under inverse Hessian conditioning.

Findings

01 Weak convergence of rescaled iterates established

02 Almost sure convergence results derived

03 Asymptotic normality linked to stochastic equicontinuity

Abstract

In this paper, we investigate a general class of stochastic gradient descent (SGD) algorithms, called Conditioned SGD, based on a preconditioning of the gradient direction. Using a discrete-time approach with martingale tools, we establish under mild assumptions the weak convergence of the rescaled sequence of iterates for a broad class of conditioning matrices including stochastic first-order and second-order methods. Almost sure convergence results, which may be of independent interest, are also presented. Interestingly, the asymptotic normality result consists in a stochastic equicontinuity property so when the conditioning matrix is an estimate of the inverse Hessian, the algorithm is asymptotically optimal.
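
To make the idea of preconditioning the gradient direction concrete, here is a minimal sketch of a generic conditioned update of the form theta_{k+1} = theta_k - gamma_{k+1} * C_k * g_{k+1}, where C_k is the conditioning matrix and g_{k+1} a noisy gradient. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: the function names (`conditioned_sgd`, `noisy_grad`, `inverse_hessian`), the toy quadratic objective with a known diagonal Hessian, the Gaussian gradient noise, and the step size 1/k.

```python
import random


def matvec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def conditioned_sgd(noisy_grad, theta0, conditioner, n_steps, step_size):
    """Iterate theta <- theta - gamma_k * C_k * g_k for k = 1..n_steps.

    noisy_grad(theta)     -> stochastic gradient at theta (list)
    conditioner(k, theta) -> conditioning matrix C_k (list of rows)
    step_size(k)          -> positive step gamma_k
    """
    theta = list(theta0)
    for k in range(1, n_steps + 1):
        grad = noisy_grad(theta)
        direction = matvec(conditioner(k, theta), grad)
        gamma = step_size(k)
        theta = [t - gamma * d for t, d in zip(theta, direction)]
    return theta


# Hypothetical toy problem: F(theta) = 0.5 * sum_i h_i * (theta_i - m_i)^2,
# observed through gradients corrupted by standard Gaussian noise.
HESSIAN_DIAG = [1.0, 10.0]   # assumed known, badly scaled curvature
TARGET = [2.0, -1.0]         # minimizer of the toy objective
rng = random.Random(0)       # fixed seed so the run is reproducible


def noisy_grad(theta):
    return [h * (t - m) + rng.gauss(0.0, 1.0)
            for h, t, m in zip(HESSIAN_DIAG, theta, TARGET)]


def inverse_hessian(k, theta):
    # Exact inverse Hessian of the toy objective; in practice this would
    # be an estimate updated along the iterations.
    return [[1.0 / HESSIAN_DIAG[0], 0.0],
            [0.0, 1.0 / HESSIAN_DIAG[1]]]


theta_hat = conditioned_sgd(noisy_grad, [0.0, 0.0], inverse_hessian,
                            n_steps=20000, step_size=lambda k: 1.0 / k)
print(theta_hat)  # should land close to TARGET
```

In this toy setting the inverse Hessian rescales each coordinate by its curvature, so both coordinates contract at the same rate even though their curvatures differ by a factor of ten. Passing a different `conditioner` (for example one returning the identity matrix) turns the same loop into plain SGD, which on this problem would need a smaller initial step to stay stable.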

Code & Models

Repositories

RemiLELUC/ConditionedSGD (official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Random Matrices and Applications · Sparse and Compressive Sensing Techniques

Methods: Stochastic Gradient Descent