SRKCD: a stabilized Runge-Kutta method for stochastic optimization
Tony Stillfjord, Måns Williamson

TL;DR
This paper introduces a new family of stochastic optimization methods based on Runge-Kutta-Chebyshev schemes, offering larger step sizes and improved robustness over traditional methods like stochastic gradient descent, with proven convergence guarantees.
Contribution
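The paper presents a convergence proof that covers essentially all stochastic Runge-Kutta optimization methods, subject to certain natural conditions on the Runge-Kutta coefficients, and demonstrates the stability and effectiveness of the RKC-based methods in optimization tasks.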
Findings
Convergence in expectation with an optimal sublinear rate for strongly convex objectives with Lipschitz-continuous gradients.
Convergence to zero in expectation of the gradients for non-convex objectives.
Numerical experiments show improved stability and performance.
Abstract
We introduce a family of stochastic optimization methods based on the Runge-Kutta-Chebyshev (RKC) schemes. The RKC methods are explicit methods originally designed for solving stiff ordinary differential equations by ensuring that their stability regions are of maximal size. In the optimization context, this allows for larger step sizes (learning rates) and better robustness compared to e.g. the popular stochastic gradient descent method. Our main contribution is a convergence proof for essentially all stochastic Runge-Kutta optimization methods. This shows convergence in expectation with an optimal sublinear rate under standard assumptions of strong convexity and Lipschitz-continuous gradients. For non-convex objectives, we get convergence to zero in expectation of the gradients. The proof requires certain natural conditions on the Runge-Kutta coefficients, and we further demonstrate…
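To illustrate why a Chebyshev-type scheme tolerates larger step sizes, here is a minimal sketch. It is not the authors' SRKCD scheme, whose exact coefficients are not given in this excerpt. It assumes the classical undamped first-order RKC recursion applied to the gradient flow x' = -∇f(x), with a possibly noisy gradient evaluated at every stage. The names `rkc_step`, `make_noisy_quadratic_grad`, and `iterate`, the quadratic test problem, and the additive Gaussian noise model are all hypothetical choices made for this illustration.

```python
import numpy as np


def rkc_step(x, grad, h, s):
    """One step of the undamped first-order Runge-Kutta-Chebyshev recursion
    for the gradient flow x' = -grad(x).

    x    : current iterate (NumPy array)
    grad : callable returning a (possibly stochastic) gradient estimate
    h    : step size (learning rate)
    s    : number of stages; s = 1 reduces to a plain gradient step
    """
    if s < 1:
        raise ValueError("s must be at least 1")
    mu = h / s**2
    k_prev = x
    k_curr = x - mu * grad(x)  # first stage: a short gradient step
    for _ in range(2, s + 1):
        # Chebyshev three-term recurrence T_j = 2*t*T_{j-1} - T_{j-2}
        k_next = 2.0 * k_curr - k_prev - 2.0 * mu * grad(k_curr)
        k_prev, k_curr = k_curr, k_next
    return k_curr


def make_noisy_quadratic_grad(eigs, noise_std, seed=0):
    """Hypothetical test problem: f(x) = 0.5 * sum(eigs * x**2),
    with additive Gaussian noise on the gradient."""
    rng = np.random.default_rng(seed)
    eigs = np.asarray(eigs, dtype=float)

    def grad(x):
        return eigs * x + noise_std * rng.standard_normal(x.shape)

    return grad


def iterate(x0, grad, h, s, n_steps):
    """Apply rkc_step repeatedly with a constant step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = rkc_step(x, grad, h, s)
    return x
```

On a noise-free quadratic with eigenvalues 1 and 10 and step size h = 1, the plain gradient step (s = 1) multiplies the stiff component by 1 - 10 = -9 per step and diverges, while s = 3 multiplies it by T_3(1 - 10/9) = 239/729 ≈ 0.328 and converges. The price is s gradient evaluations per step, in exchange for a real stability interval of length 2s² instead of 2. A constant step size is used here only to show the stability effect; it is not meant to reproduce the convergence rates stated in the abstract.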
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Stochastic processes and financial applications
