A note on diffusion limits for stochastic gradient descent

Alberto Lanconelli; Christopher S. A. Lauria

arXiv:2210.11257 · cs.LG · October 21, 2022 · 1 citation

Open Access

TL;DR

This paper provides a rigorous theoretical justification for approximating stochastic gradient descent by a stochastic differential equation with Gaussian noise, an approximation widely used to study the purported implicit regularization properties of SGD.

Contribution

It gives a novel, rigorous argument showing how Gaussian noise arises naturally when stochastic gradient descent is approximated by a stochastic differential equation, supporting the use of that approximation in analysis.

Findings

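01 Gaussian noise arises naturally in the diffusion limit of SGD.

02 The common approximation of SGD by a stochastic differential equation with Gaussian noise receives a rigorous theoretical justification.

03 The result supports analyses that rely on this approximation to study the implicit regularization role of noise in SGD.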

Abstract

In the machine learning literature, stochastic gradient descent has recently been widely discussed for its purported implicit regularization properties. Much of the theory that attempts to clarify the role of noise in stochastic gradient algorithms approximates stochastic gradient descent by a stochastic differential equation with Gaussian noise. We provide a novel, rigorous theoretical justification for this practice that shows how the Gaussianity of the noise arises naturally.
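
To make the approximation mentioned in the abstract concrete, the following is a minimal sketch of the kind of comparison usually meant by it. It is not the construction used in the paper. It assumes a one-dimensional quadratic loss f(x) = mean_i 0.5 * (x - a_i)^2, batch size 1, and the commonly used first-order diffusion approximation dX = -f'(X) dt + sqrt(lr) * sigma dW with time step dt = lr. The data values and the names `sgd_path`, `sde_path`, and `compare` are hypothetical and chosen only for illustration.

```python
import random
import statistics


def sgd_path(data, x0, lr, n_steps, rng):
    """Plain SGD with batch size 1 on f(x) = mean_i 0.5 * (x - a_i)^2."""
    x = x0
    for _ in range(n_steps):
        a = rng.choice(data)      # draw one data point uniformly at random
        x -= lr * (x - a)         # gradient of 0.5 * (x - a)^2 is (x - a)
    return x


def sde_path(data, x0, lr, n_steps, rng):
    """Euler-Maruyama for dX = -(X - mean) dt + sqrt(lr) * sigma dW, dt = lr."""
    mean = statistics.fmean(data)
    sigma = statistics.pstdev(data)   # std of the per-sample gradient noise
    x = x0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, lr ** 0.5)    # Brownian increment over dt = lr
        x += -(x - mean) * lr + (lr ** 0.5) * sigma * dw
    return x


def compare(n_paths=2000, lr=0.05, n_steps=200, seed=0):
    """Return the empirical mean and variance of the final iterate for both models."""
    rng = random.Random(seed)
    data = [-1.0, 0.0, 0.5, 2.5]      # illustrative data, assumed for this sketch
    sgd = [sgd_path(data, 3.0, lr, n_steps, rng) for _ in range(n_paths)]
    sde = [sde_path(data, 3.0, lr, n_steps, rng) for _ in range(n_paths)]
    return {
        "sgd_mean": statistics.fmean(sgd),
        "sgd_var": statistics.pvariance(sgd),
        "sde_mean": statistics.fmean(sde),
        "sde_var": statistics.pvariance(sde),
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.4f}")
```

In this toy setting the SGD noise is discrete (it takes one of four values per step), while the discretized SDE injects Gaussian noise with the same mean and variance per step. The final iterates of the two models should therefore have closely matching means and variances, which is the sense in which the Gaussian model is used as a stand-in for SGD in this sketch.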

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Mathematical Biology Tumor Growth