
TL;DR
This paper gives a convergence analysis of biased nonconvex stochastic gradient descent (SGD), covering weak, function-value, and global convergence together with the corresponding rates and complexities, all under relatively mild assumptions.
Contribution
The paper offers a full-scope convergence study of biased nonconvex SGD, including weak, function-value, and global convergence, with convergence rates and complexities under mild conditions.
Findings
Established weak convergence of biased nonconvex SGD
Derived convergence rates and complexities
Provided conditions under which convergence guarantees hold
Abstract
Stochastic gradient descent (SGD) has been a go-to algorithm for nonconvex stochastic optimization problems arising in machine learning. Its theory, however, often requires a strong framework to guarantee convergence properties. We hereby present a full-scope convergence study of biased nonconvex SGD, including weak convergence, function-value convergence, and global convergence, and also provide subsequent convergence rates and complexities, all under relatively mild conditions in comparison with the literature.
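To make the setting concrete, here is a minimal sketch of a biased SGD iteration on a toy nonconvex objective. Everything specific in it is an assumption made for illustration and is not taken from the paper: the objective f(x) = x^2 + 3 sin^2(x), the step sizes proportional to (k+1)^(-0.75), the deterministic bias decaying like 1/(k+1), and the Gaussian noise. The names `grad_f` and `biased_sgd` are hypothetical.

```python
import math
import random


def grad_f(x):
    """Gradient of the toy objective f(x) = x**2 + 3*sin(x)**2.

    The objective is an illustrative assumption: it is nonconvex
    (its second derivative 2 + 6*cos(2x) changes sign), and x = 0
    is its only stationary point.
    """
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


def biased_sgd(grad, x0, num_steps, seed=0,
               step_scale=0.1, bias_scale=0.5, noise_std=1.0):
    """Run SGD with a gradient oracle that is both noisy and biased.

    Assumed oracle model (not the paper's): at iteration k the oracle
    returns grad(x) + bias_k + noise_k, where bias_k = bias_scale/(k+1)
    is a vanishing deterministic bias and noise_k is zero-mean Gaussian.
    """
    rng = random.Random(seed)
    x = x0
    for k in range(num_steps):
        # Diminishing step sizes: their sum diverges, the sum of squares converges.
        step = step_scale / (k + 1) ** 0.75
        bias = bias_scale / (k + 1)
        noise = rng.gauss(0.0, noise_std)
        g = grad(x) + bias + noise  # biased stochastic gradient estimate
        x -= step * g
    return x
```

Calling `biased_sgd(grad_f, x0=2.0, num_steps=5000)` and then evaluating `abs(grad_f(x))` at the returned point gives a quick check that the gradient is driven toward zero, which is the kind of behavior the convergence results describe. The sketch says nothing about the paper's actual conditions or rates.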
Taxonomy
Methods: Stochastic Gradient Descent
