Convergence of continuous-time stochastic gradient descent with applications to deep neural networks

Gabor Lugosi; Eulalia Nualart

arXiv:2409.07401 · cs.LG · November 3, 2025

TL;DR

This paper analyzes a continuous-time approximation of stochastic gradient descent for minimizing the population expected loss, gives general sufficient conditions for its convergence, and applies these results to the training of overparametrized neural networks.

Contribution

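It extends the convergence results of Chatterjee (2022), established for (nonstochastic) gradient descent, to a continuous-time approximation of stochastic gradient descent, and shows how the main result applies to overparametrized neural network training.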

Findings

01

Established general convergence conditions for continuous-time stochastic gradient descent

02

Extended convergence analysis from nonstochastic to stochastic gradient descent

03

Applied theoretical results to overparametrized neural network training

Abstract

We study a continuous-time approximation of the stochastic gradient descent process for minimizing the population expected loss in learning problems. The main results establish general sufficient conditions for convergence, extending the results of Chatterjee (2022), which were established for (nonstochastic) gradient descent. We show how the main result can be applied to the case of overparametrized neural network training.
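As a rough illustration of what a continuous-time approximation of stochastic gradient descent can look like, the sketch below simulates a one-dimensional diffusion with the Euler-Maruyama scheme. The abstract does not state the exact process studied in the paper, so the form used here, dw = -f'(w) dt + sqrt(eta) * sigma(w) dB, is an assumption borrowed from commonly used diffusion approximations of SGD. The toy loss, the names `simulate_sde` and `loss`, and all parameter values are hypothetical. The data are noiseless (every sample is fit exactly by `w_true`), so the gradient noise vanishes at the minimizer, loosely mimicking an interpolation regime.

```python
import math
import random


def loss(xs, w_true, w):
    """Mean squared loss 0.5 * (w*x - w_true*x)^2 averaged over the samples."""
    return 0.5 * sum(((w - w_true) * x) ** 2 for x in xs) / len(xs)


def simulate_sde(xs, w_true, w0, eta=0.1, dt=0.01, steps=5000, seed=0):
    """Euler-Maruyama simulation of the ASSUMED diffusion
    dw = -f'(w) dt + sqrt(eta) * sigma(w) dB,
    where sigma(w) is the standard deviation of the per-sample gradients.
    This is an illustrative model, not necessarily the one in the paper.
    """
    rng = random.Random(seed)
    n = len(xs)
    w = w0
    for _ in range(steps):
        # Per-sample gradients of 0.5 * (w*x - w_true*x)^2 with respect to w.
        grads = [(w - w_true) * x * x for x in xs]
        mean_grad = sum(grads) / n
        # Noise scale: standard deviation of the per-sample gradients.
        var = sum((g - mean_grad) ** 2 for g in grads) / n
        sigma = math.sqrt(var)
        # Brownian increment over a step of length dt.
        d_b = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        w += -mean_grad * dt + math.sqrt(eta) * sigma * d_b
    return w


if __name__ == "__main__":
    xs = [0.5, 1.0, 1.5, 2.0]
    w_final = simulate_sde(xs, w_true=2.0, w0=0.0)
    print("initial loss:", loss(xs, 2.0, 0.0))
    print("final loss:", loss(xs, 2.0, w_final))
```

In this toy setting the noise term is proportional to the distance from the minimizer, so the simulated path contracts toward `w_true` despite the randomness. This only illustrates the kind of behavior a convergence result describes; it does not reproduce the paper's conditions or proofs.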

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Machine Learning and ELM