Limit Theorems for Stochastic Gradient Descent with Infinite Variance
Jose Blanchet, Aleksandar Mijatović, Wenhao Yang

TL;DR
This paper investigates the long-term behavior of stochastic gradient descent when gradients have infinite variance, extending classical results to multidimensional cases with stable Lévy processes.
Contribution
It extends the theoretical understanding of SGD with infinite variance gradients to multidimensional settings, characterizing its asymptotic distribution as a stable Lévy-driven Ornstein-Uhlenbeck process.
Findings
Asymptotic distribution characterized by stable Lévy process
Extension from 1D to multidimensional case
Applications demonstrated in linear and logistic regression
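The setting in these findings can be illustrated with a small simulation. This is a minimal sketch and not the authors' experiment. It runs SGD on a linear regression problem whose label noise is regularly varying with infinite variance. The symmetric Pareto noise, the tail index `alpha=1.5`, the step size `0.5 / n**0.8`, and the function names are all assumptions chosen for illustration.

```python
import numpy as np


def heavy_tailed_noise(rng, alpha, size):
    """Symmetric Pareto-type noise with P(|Z| > t) = t**(-alpha) for t >= 1.

    For alpha in (1, 2) the mean is finite (zero by symmetry) but the
    variance is infinite. This choice of distribution is an assumption.
    """
    # rng.pareto samples the Lomax law; adding 1 gives a classical Pareto on [1, inf).
    magnitude = rng.pareto(alpha, size=size) + 1.0
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * magnitude


def sgd_linear_regression(n_steps, dim=2, alpha=1.5, step_exponent=0.8, seed=0):
    """Run SGD on squared loss with heavy-tailed label noise; return the last iterate."""
    rng = np.random.default_rng(seed)
    theta_star = np.ones(dim)  # hypothetical ground-truth parameter
    theta = np.zeros(dim)
    for n in range(1, n_steps + 1):
        x = rng.standard_normal(dim)
        y = x @ theta_star + heavy_tailed_noise(rng, alpha, 1)[0]
        grad = (x @ theta - y) * x  # stochastic gradient of 0.5 * (x @ theta - y)**2
        theta = theta - 0.5 * grad / n**step_exponent  # assumed polynomial step size
    return theta


if __name__ == "__main__":
    # Final-iterate errors across independent runs; occasional large values
    # reflect the heavy-tailed gradient noise.
    errors = [
        np.linalg.norm(sgd_linear_regression(5000, seed=s) - np.ones(2))
        for s in range(20)
    ]
    print(np.round(np.sort(errors), 3))
```

Repeating the run over many seeds and plotting a histogram of the suitably rescaled errors is one way to look at the non-Gaussian limiting behavior that the paper characterizes.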
Abstract
Stochastic gradient descent is a classic algorithm that has gained great popularity, especially in recent decades, as the most common approach for training models in machine learning. While the algorithm has been well studied when stochastic gradients are assumed to have finite variance, there is significantly less research addressing its theoretical properties in the case of infinite variance gradients. In this paper, we establish the asymptotic behavior of stochastic gradient descent in the context of infinite variance stochastic gradients, assuming that the stochastic gradient is regularly varying with a given index. The closest result in this context was established in 1969, in the one-dimensional case and assuming that stochastic gradients belong to a more restrictive class of distributions. We extend it to the multidimensional case, covering a broader class of infinite…
Taxonomy
Topics: Point processes and geometric inequalities · Markov Chains and Monte Carlo Methods · Geometric Analysis and Curvature Flows
Methods: Linear Regression · Logistic Regression
