The promises and pitfalls of Stochastic Gradient Langevin Dynamics

Nicolas Brosse; Alain Durmus; Eric Moulines

arXiv:1811.10072·stat.ML·November 27, 2018·47 cites

Open Access

TL;DR

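This paper analyzes Stochastic Gradient Langevin Dynamics (SGLD) with a constant step size on large datasets and shows that its invariant distribution departs from the target posterior. It also analyzes SGLD Fixed Point (SGLDFP), a previously suggested variance reduction method based on control variates, and shows that it gives approximate posterior samples at a computational cost sublinear in the number of data points.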

Contribution

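The paper provides a detailed theoretical analysis of the invariant distribution of constant-step-size SGLD and shows that SGLDFP, a variance reduction technique based on control variates, restores sampling accuracy at a lower computational cost than full-gradient Langevin Monte Carlo.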

Findings

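01. With a constant step size inversely proportional to the dataset size $N$, SGLD's invariant measure departs significantly from the true posterior as $N$ grows, and the algorithm behaves like SGD.

02. SGLDFP yields approximate samples from the posterior at a computational cost sublinear in the number of data points.

03. The paper derives explicit bounds on the Wasserstein distances between the SGLD variants and Langevin Monte Carlo.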

Abstract

Stochastic Gradient Langevin Dynamics (SGLD) has emerged as a key MCMC algorithm for Bayesian learning from large scale datasets. While SGLD with decreasing step sizes converges weakly to the posterior distribution, the algorithm is often used with a constant step size in practice and has demonstrated successes in machine learning tasks. The current practice is to set the step size inversely proportional to $N$ where $N$ is the number of training samples. As $N$ becomes large, we show that the SGLD algorithm has an invariant probability measure which significantly departs from the target posterior and behaves like Stochastic Gradient Descent (SGD). This difference is inherently due to the high variance of the stochastic gradients. Several strategies have been suggested to reduce this effect; among them, SGLD Fixed Point (SGLDFP) uses carefully designed control variates to reduce the…
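The contrast described in the abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: it assumes a one-dimensional linear regression with known unit noise variance and a flat prior, so the posterior is Gaussian with mean `theta_star` and variance `post_var`. The function name `run_chain`, the step size `0.1 / N`, the batch size, and the use of minibatches drawn with replacement are all illustrative assumptions. With `control_variate=True` the minibatch gradient is centered at the posterior mode, which is the control-variate idea behind SGLDFP.

```python
import numpy as np

def run_chain(x, y, theta_star, gamma, p, n_iter, rng, control_variate):
    """Constant-step SGLD (control_variate=False) or SGLDFP (True), flat prior."""
    n = x.shape[0]
    theta = theta_star  # start at the mode to keep burn-in short
    samples = np.empty(n_iter)
    for k in range(n_iter):
        idx = rng.integers(0, n, size=p)  # assumption: minibatch with replacement
        # Per-datum gradients of U_i(theta) = (y_i - theta * x_i)^2 / 2
        grad_i = x[idx] * (theta * x[idx] - y[idx])
        if control_variate:
            # Subtract the same gradients evaluated at the mode; the full
            # gradient at the mode is zero here, so no extra term is needed.
            grad_i = grad_i - x[idx] * (theta_star * x[idx] - y[idx])
        grad_est = (n / p) * grad_i.sum()  # unbiased estimate of the full gradient
        noise = np.sqrt(2.0 * gamma) * rng.standard_normal()
        theta = theta - gamma * grad_est + noise
        samples[k] = theta
    return samples

rng = np.random.default_rng(0)
N, p = 10_000, 10                      # dataset size and minibatch size
x = rng.standard_normal(N)
y = 1.5 * x + rng.standard_normal(N)   # synthetic data, unit noise variance

theta_star = (x @ y) / (x @ x)         # posterior mode (least-squares solution)
post_var = 1.0 / (x @ x)               # exact posterior variance under a flat prior
gamma = 0.1 / N                        # step size inversely proportional to N

sgld = run_chain(x, y, theta_star, gamma, p, 20_000, rng, control_variate=False)
sgldfp = run_chain(x, y, theta_star, gamma, p, 20_000, rng, control_variate=True)

var_sgld = sgld[1000:].var()           # discard a short burn-in
var_sgldfp = sgldfp[1000:].var()
print("posterior variance:", post_var)
print("SGLD variance:     ", var_sgld)
print("SGLDFP variance:   ", var_sgldfp)
```

In this setup the SGLD chain's empirical variance should come out well above the posterior variance, because the minibatch gradient noise dominates the injected Gaussian noise, while the SGLDFP chain's variance should stay close to the posterior variance. The toy model only illustrates the mechanism and says nothing about the paper's quantitative bounds.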

Taxonomy

Topics: Quantum many-body systems · Advanced Thermodynamics and Statistical Mechanics · Quantum Computing Algorithms and Architecture

Methods: Stochastic Gradient Descent