SRKCD: a stabilized Runge-Kutta method for stochastic optimization

Tony Stillfjord; Måns Williamson

arXiv:2201.12782·math.OC·February 1, 2022

TL;DR

This paper introduces a new family of stochastic optimization methods based on Runge-Kutta-Chebyshev schemes, offering larger step sizes and improved robustness over traditional methods like stochastic gradient descent, with proven convergence guarantees.

Contribution

The paper proves convergence for essentially all stochastic Runge-Kutta optimization methods, subject to natural conditions on the Runge-Kutta coefficients, and demonstrates the stability and effectiveness of the RKC-based methods in optimization tasks.

Findings

01

Convergence in expectation with an optimal sublinear rate for strongly convex objectives with Lipschitz-continuous gradients.

02

For non-convex objectives, the gradients converge to zero in expectation.

03

Numerical experiments show improved stability and performance.

Abstract

We introduce a family of stochastic optimization methods based on the Runge-Kutta-Chebyshev (RKC) schemes. The RKC methods are explicit methods originally designed for solving stiff ordinary differential equations by ensuring that their stability regions are of maximal size. In the optimization context, this allows for larger step sizes (learning rates) and better robustness compared to, e.g., the popular stochastic gradient descent method. Our main contribution is a convergence proof for essentially all stochastic Runge-Kutta optimization methods. This shows convergence in expectation with an optimal sublinear rate under standard assumptions of strong convexity and Lipschitz-continuous gradients. For non-convex objectives, we get convergence to zero in expectation of the gradients. The proof requires certain natural conditions on the Runge-Kutta coefficients, and we further demonstrate…
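To illustrate why an RKC-type scheme tolerates larger step sizes than plain gradient descent, here is a minimal sketch of the classical undamped first-order RKC recursion applied to the gradient flow x' = -∇f(x). This is not necessarily the SRKCD method of the paper, whose coefficients and damping are not given on this page. The names `rkc_step`, `grad`, `h`, and `s` are hypothetical, and `grad` may be an exact or a mini-batch gradient.

```python
def rkc_step(x, grad, h, s):
    """One s-stage, undamped first-order RKC step for x' = -grad(x).

    Illustrative sketch only; the coefficients are those of the classical
    Chebyshev recursion, not necessarily those used in the paper.
    Works for floats or any array type supporting + - * with scalars.
    """
    if s < 1:
        raise ValueError("number of stages s must be at least 1")
    scale = h / s**2           # effective step size per stage
    y_prev = x                 # stage 0
    y = x - scale * grad(x)    # stage 1 (for s == 1 this is plain gradient descent)
    for _ in range(2, s + 1):
        # Chebyshev three-term recurrence: Y_j = 2*Y_{j-1} - Y_{j-2} - 2*scale*grad(Y_{j-1})
        y, y_prev = 2.0 * y - y_prev - 2.0 * scale * grad(y), y
    return y


# Quadratic test problem f(x) = 0.5 * lam * x**2 with gradient lam * x.
lam = 10.0
quad_grad = lambda x: lam * x

one_stage = rkc_step(1.0, quad_grad, h=1.0, s=1)   # gradient descent, h*lam = 10 > 2
four_stage = rkc_step(1.0, quad_grad, h=1.0, s=4)  # h*lam = 10 < 2*s**2 = 32
```

For a linear gradient `lam * x`, one step of this recursion multiplies `x` by the Chebyshev polynomial T_s(1 - h*lam/s^2), which stays bounded by 1 in magnitude whenever h*lam ≤ 2s^2. With a single stage the bound is the familiar h*lam ≤ 2 of gradient descent, so in the example the one-stage step grows in magnitude while the four-stage step contracts.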

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Stochastic processes and financial applications