Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization
Yuri Kinoshita, Taiji Suzuki

TL;DR
This paper introduces variance-reduced stochastic gradient Langevin dynamics algorithms with improved convergence rates under weaker assumptions, enhancing sampling and non-convex optimization efficiency.
Contribution
The paper proves convergence of two variance-reduced Langevin algorithms under weaker conditions and derives improved gradient complexity bounds for achieving $\epsilon$-precision.
Findings
Convergence to the target distribution under weaker assumptions.
Gradient complexity improved to $\tilde{O}((n + dn^{1/2}\epsilon^{-1})\gamma^2 L^2 \alpha^{-2})$.
Applications demonstrated in non-convex optimization.
Abstract
The stochastic gradient Langevin Dynamics is one of the most fundamental algorithms to solve sampling problems and non-convex optimization appearing in several machine learning applications. Especially, its variance reduced versions have nowadays gained particular attention. In this paper, we study two variants of this kind, namely, the Stochastic Variance Reduced Gradient Langevin Dynamics and the Stochastic Recursive Gradient Langevin Dynamics. We prove their convergence to the objective distribution in terms of KL-divergence under the sole assumptions of smoothness and Log-Sobolev inequality, which are weaker conditions than those used in prior works for these algorithms. With the batch size and the inner loop length set to $\sqrt{n}$, the gradient complexity to achieve an $\epsilon$-precision is $\tilde{O}((n + dn^{1/2}\epsilon^{-1})\gamma^2 L^2 \alpha^{-2})$, which is an improvement…
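For readers unfamiliar with the first of the two algorithms named in the abstract, the following is a minimal Python sketch of Stochastic Variance Reduced Gradient Langevin Dynamics in its commonly stated form. It is not taken from the paper. It assumes a target distribution proportional to $\exp(-\gamma F(x))$ with $F(x) = \frac{1}{n}\sum_i f_i(x)$, and it takes both the batch size and the inner loop length to be about $\sqrt{n}$. The function name `svrg_ld`, the callback `grad_i`, the step size `eta`, and the quadratic toy problem are all hypothetical choices made for illustration.

```python
import math
import numpy as np

def svrg_ld(grad_i, n, x0, eta, gamma, num_epochs, rng=None):
    """Illustrative SVRG-LD sampler (a sketch, not the authors' code).

    grad_i(x, idx) must return per-sample gradients of shape (len(idx), d).
    Assumed target density: proportional to exp(-gamma * F(x)).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    d = x.size
    # Assumption of this sketch: batch size B and inner loop length m
    # are both floor(sqrt(n)).
    B = m = max(1, math.isqrt(n))
    all_idx = np.arange(n)
    samples = []
    for _ in range(num_epochs):
        # Outer loop: store a snapshot and its full gradient.
        snapshot = x.copy()
        full_grad = grad_i(snapshot, all_idx).mean(axis=0)
        for _ in range(m):
            idx = rng.choice(n, size=B, replace=False)
            # Variance-reduced gradient estimate built around the snapshot.
            v = full_grad + (grad_i(x, idx) - grad_i(snapshot, idx)).mean(axis=0)
            # Langevin step: gradient descent plus scaled Gaussian noise.
            x = x - eta * v + math.sqrt(2.0 * eta / gamma) * rng.standard_normal(d)
            samples.append(x.copy())
    return np.array(samples)

# Hypothetical toy problem: f_i(x) = 0.5 * ||x - a_i||^2, so the assumed
# target is a Gaussian centred at the mean of the a_i.
rng = np.random.default_rng(0)
n, d = 100, 2
a = rng.normal(loc=3.0, scale=1.0, size=(n, d))

def grad_i(x, idx):
    return x - a[idx]  # broadcasts to shape (len(idx), d)

samples = svrg_ld(grad_i, n, np.zeros(d), eta=0.1, gamma=100.0,
                  num_epochs=200, rng=rng)
estimated_mean = samples[-1000:].mean(axis=0)
```

On this toy problem the average of the late iterates should sit close to `a.mean(axis=0)`. The sketch does not reproduce the paper's step-size conditions or its convergence guarantees.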
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Sparse and Compressive Sensing Techniques
