No Free Lunch for Stochastic Gradient Langevin Dynamics
Natesh S. Pillai, Aaron Smith, Azeem Zaman

TL;DR
This paper shows that using smaller data subsamples in stochastic gradient Langevin dynamics (SGLD) often fails to improve efficiency: the loss in accuracy offsets the per-step speed gain, which calls the method's scalability benefits into question.
Contribution
It provides a theoretical analysis of the limitations of SGLD with data subsampling, complementing earlier work that analyzed only upper bounds on errors or did not cover nonreversible algorithms.
Findings
Smaller subsamples reduce per-step computation but decrease accuracy.
Accuracy loss offsets speed gains in typical datasets.
SGLD's scalability benefits are limited by this accuracy-speed trade-off.
Abstract
As sample sizes grow, scalability has become a central concern in the development of Markov chain Monte Carlo (MCMC) methods. One general approach to this problem, exemplified by the popular stochastic gradient Langevin dynamics (SGLD) algorithm, is to use a small random subsample of the data at every time step. This paper, building on recent work such as \cite{nagapetyan2017true,JohndrowJamesE2020NFLf}, shows that this approach often fails: while decreasing the sample size increases the speed of each MCMC step, for typical datasets this is balanced by a matching decrease in accuracy. This result complements recent work such as \cite{nagapetyan2017true} (which came to the same conclusion, but analyzed only specific upper bounds on errors rather than actual errors) and \cite{JohndrowJamesE2020NFLf} (which did not analyze nonreversible algorithms and allowed for logarithmic improvements).
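The subsampling idea described in the abstract can be illustrated with a minimal SGLD sketch. This is not the paper's experimental setup: the Gaussian mean model, the prior variance, the step size, and the function name `sgld_gaussian_mean` are all assumptions chosen for illustration. Each step draws a random minibatch, rescales its log-likelihood gradient by N/n to estimate the full-data gradient, and adds Gaussian noise with variance equal to the step size.

```python
import numpy as np


def sgld_gaussian_mean(data, batch_size, step_size, n_steps,
                       prior_var=100.0, seed=0):
    """Illustrative SGLD sampler for the mean of a unit-variance Gaussian.

    Assumed model (hypothetical, not from the paper):
        x_i ~ N(theta, 1),  theta ~ N(0, prior_var).
    """
    rng = np.random.default_rng(seed)
    n_total = data.shape[0]
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Small random subsample of the data, drawn fresh at every step.
        batch = rng.choice(data, size=batch_size, replace=False)
        # Gradient of the log prior.
        grad_prior = -theta / prior_var
        # Minibatch log-likelihood gradient, rescaled by N/n so that it is
        # an unbiased estimate of the full-data gradient.
        grad_lik = (n_total / batch_size) * np.sum(batch - theta)
        # Langevin update: half-step along the gradient plus N(0, step_size) noise.
        noise = np.sqrt(step_size) * rng.normal()
        theta = theta + 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples[t] = theta
    return samples


# Synthetic data for demonstration only.
data = np.random.default_rng(1).normal(loc=1.0, scale=1.0, size=1000)
samples = sgld_gaussian_mean(data, batch_size=100, step_size=1e-4, n_steps=5000)
print("sample mean of data:", data.mean())
print("SGLD posterior mean estimate:", samples[1000:].mean())
```

Shrinking `batch_size` makes each step cheaper but increases the variance of the rescaled gradient, which is the accuracy-speed trade-off the paper analyzes.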
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Mathematical Biology Tumor Growth · Theoretical and Computational Physics
