Towards Continuous-Time Approximations for Stochastic Gradient Descent without Replacement
Stefan Perko

TL;DR
This paper introduces a novel continuous-time approximation for stochastic gradient descent without replacement (SGDo), providing theoretical convergence guarantees and insights into its learning dynamics.
Contribution
It proposes a stochastic continuous-time model for SGDo using Young differential equations driven by epoched Brownian motion, with proven convergence for strongly convex objectives.
Findings
Proves almost sure convergence of the continuous-time approximation for strongly convex objectives and a specific family of learning rate schedules.
Derives an upper bound on the asymptotic rate of almost sure convergence.
Shows that this bound is as good as or better than previous results for SGDo.
Abstract
Gradient optimization algorithms using epochs, that is, those based on stochastic gradient descent without replacement (SGDo), are predominantly used to train machine learning models in practice. However, the mathematical theory of SGDo and related algorithms remains underexplored compared to their "with replacement" and "one-pass" counterparts. In this article, we propose a stochastic, continuous-time approximation to SGDo with additive noise based on a Young differential equation driven by a stochastic process we call an "epoched Brownian motion". We show its usefulness by proving the almost sure convergence of the continuous-time approximation for strongly convex objectives and learning rate schedules of the form . Moreover, we compute an upper bound on the asymptotic rate of almost sure convergence, which is as good or better than previous…
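To make the discrete algorithm behind the abstract concrete, here is a minimal sketch of SGDo in plain Python. It is an illustration only, not the paper's continuous-time model: the function name `sgd_without_replacement`, the toy quadratic objective, the reshuffle-every-epoch variant, and the schedule `1 / (k + 2)` are all assumptions chosen for the example (the abstract's own schedule is not recoverable from this page).

```python
import random

def sgd_without_replacement(grads, x0, lr, epochs, seed=0):
    """Hypothetical helper: each epoch visits every component gradient
    exactly once, in a freshly shuffled order (sampling without replacement)."""
    rng = random.Random(seed)
    x = x0
    order = list(range(len(grads)))
    visits = []   # the order used in each epoch, kept for inspection
    step = 0      # global step counter that drives the learning rate schedule
    for _ in range(epochs):
        rng.shuffle(order)
        visits.append(list(order))
        for i in order:
            x = x - lr(step) * grads[i](x)
            step += 1
    return x, visits

# Toy strongly convex objective f(x) = (1/n) * sum_i 0.5 * (x - a_i)^2,
# whose minimiser is the mean of the a_i (here 2.5).
data = [1.0, 2.0, 3.0, 4.0]
grads = [lambda x, a=a: x - a for a in data]

# Assumed decaying schedule, chosen only for this illustration.
x_final, visits = sgd_without_replacement(
    grads, x0=0.0, lr=lambda k: 1.0 / (k + 2), epochs=200
)
print(x_final)
```

Within an epoch no data point is drawn twice, which is what distinguishes SGDo from the "with replacement" variant, where each step samples an index independently.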
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Markov Chains and Monte Carlo Methods
