Losing momentum in continuous-time stochastic optimisation

Kexin Jin; Jonas Latz; Chenguang Liu; Alessandro Scagliotti

arXiv:2209.03705·math.OC·November 6, 2024

TL;DR

This paper introduces a continuous-time model for stochastic gradient descent with momentum, analyses its long-time and limiting behaviour, and proposes a stable discretisation scheme that performs competitively on machine learning tasks.

Contribution

The paper develops a continuous-time model for stochastic gradient descent with momentum, together with a convergence analysis and a stable discretisation scheme.

Findings

01 The model converges to the global minimiser under certain conditions.

02 The discretisation scheme is stable and effective in practice.

03 The algorithm performs competitively on neural network training tasks.

Abstract

The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-scale data. In this context, momentum-based stochastic optimisation algorithms have become particularly widespread. The stochasticity arises from data subsampling which reduces computational cost. Both momentum and stochasticity help the algorithm to converge globally. In this work, we propose and analyse a continuous-time model for stochastic gradient descent with momentum. This model is a piecewise-deterministic Markov process that represents the optimiser by an underdamped dynamical system and the data subsampling through a stochastic switching. We investigate longtime limits, the subsampling-to-no-subsampling limit, and the momentum-to-no-momentum limit. We are particularly interested in the case of reducing the momentum over time.…
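To make the idea of an underdamped system with stochastic switching concrete, here is a minimal toy sketch. It is not the paper's model or its discretisation scheme. Everything in it is an assumption chosen for illustration: the function name `simulate_switched_underdamped`, the parameters `mass`, `friction` and `mean_wait`, the one-dimensional quadratic losses f_i(x) = (x - y_i)^2 / 2, the exponentially distributed waiting times between switches, and the plain semi-implicit Euler step.

```python
import random


def simulate_switched_underdamped(data, x0=0.0, v0=0.0, mass=0.1, friction=1.0,
                                  mean_wait=0.05, t_end=50.0, h=1e-3, seed=0):
    """Toy piecewise-deterministic process (illustrative assumptions only).

    Between switches the state follows the underdamped system
        x' = v,   mass * v' = -friction * v - grad f_i(x),
    with f_i(x) = 0.5 * (x - data[i])**2. At random times the active
    index i is redrawn uniformly, mimicking data subsampling.
    Returns (final position, time-average of x over the second half).
    """
    rng = random.Random(seed)
    x, v, t = x0, v0, 0.0
    i = rng.randrange(len(data))
    # Assumed: exponential waiting times with mean `mean_wait`.
    next_switch = rng.expovariate(1.0 / mean_wait)
    avg_sum, avg_count = 0.0, 0

    while t < t_end:
        if t >= next_switch:
            # Stochastic switching: pick a new data point.
            i = rng.randrange(len(data))
            next_switch = t + rng.expovariate(1.0 / mean_wait)
        grad = x - data[i]  # gradient of the active quadratic loss
        # Semi-implicit Euler: update velocity first, then position.
        v += h * (-friction * v - grad) / mass
        x += h * v
        t += h
        if t >= 0.5 * t_end:
            avg_sum += x
            avg_count += 1

    return x, avg_sum / avg_count


if __name__ == "__main__":
    x_final, x_avg = simulate_switched_underdamped([-1.0, 0.0, 1.0, 2.0])
    print(x_final, x_avg)
```

In this toy setting the full objective is minimised at the mean of the data, 0.5. With a fixed `mean_wait` the trajectory keeps fluctuating around that value rather than settling on it, since every switch changes the force acting on the particle. Shrinking `mean_wait` or `mass` gives a rough feel for the subsampling and momentum limits mentioned in the abstract, although the explicit step size `h` would have to shrink as well to keep this simple integrator stable.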

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Sparse and Compressive Sensing Techniques
