Convergence of Multi-Level Markov Chain Monte Carlo Adaptive Stochastic Gradient Algorithms
Antoine Godichon-Baggioni (LPSM), Gabriel Lang (MIA Paris-Saclay), Sylvain Le Corff, Julien Stoehr (CEREMADE), Sobihan Surendran

TL;DR
This paper introduces a multilevel Monte Carlo gradient estimator that reduces bias efficiently and integrates it into adaptive stochastic gradient algorithms, improving convergence in complex models like autoencoders.
Contribution
It proposes a novel multilevel MCMC gradient estimator with bias decay and low computational cost, and develops new multilevel adaptive gradient algorithms with proven convergence rates.
Findings
Bias decays as O(T_n^{-1}) with logarithmic cost growth
New multilevel variants of Adagrad and AMSGrad are developed
Convergence rate of O(n^{-1/2}) up to logarithmic factors
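For orientation, the base recursion that such variants build on can be sketched as follows. This is the textbook diagonal Adagrad update, not the paper's multilevel algorithm. The name `grad_estimator` is a hypothetical callable introduced for illustration: it receives the current iterate and the iteration index n, so that a (possibly biased) estimator whose truncation level depends on n could be plugged in. The step size `lr` and the constant `eps` are assumed defaults, not values from the paper.

```python
import math


def adagrad(grad_estimator, theta0, n_iters, lr=1.0, eps=1e-8):
    """Textbook diagonal Adagrad driven by a user-supplied gradient estimator.

    grad_estimator(theta, n) is a hypothetical interface: it returns a list
    with one gradient estimate per coordinate at iteration n.
    """
    theta = list(theta0)
    accum = [0.0] * len(theta)  # running sum of squared gradient estimates
    for n in range(1, n_iters + 1):
        g = grad_estimator(theta, n)
        for i, g_i in enumerate(g):
            accum[i] += g_i * g_i
            # Per-coordinate step, scaled by the accumulated squared gradients.
            theta[i] -= lr * g_i / (math.sqrt(accum[i]) + eps)
    return theta


if __name__ == "__main__":
    # Sanity check with an exact gradient of 0.5 * ||theta - target||^2.
    target = [1.0, -2.0]
    exact_grad = lambda th, n: [t - c for t, c in zip(th, target)]
    print(adagrad(exact_grad, [0.0, 0.0], n_iters=200))
```

Passing the iteration index to the estimator is a design choice of this sketch: it leaves room for a schedule in n without changing the optimizer loop.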
Abstract
Stochastic optimization in learning and inference often relies on Markov chain Monte Carlo (MCMC) to approximate gradients when exact computation is intractable. However, finite-time MCMC estimators are biased, and reducing this bias typically comes at a higher computational cost. We propose a multilevel Monte Carlo gradient estimator whose bias decays as O(T_n^{-1}) while its expected computational cost grows only as O(log T_n), where T_n is the maximal truncation level at iteration n. Building on this approach, we introduce a multilevel MCMC framework for adaptive stochastic gradient methods, leading to new multilevel variants of Adagrad and AMSGrad algorithms. Under conditions controlling the estimator bias and its second and third moments, we establish a convergence rate of order O(n^{-1/2}) up to logarithmic factors. Finally, we illustrate these results on…
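The trade-off between bias and expected cost can be illustrated with a generic randomized multilevel construction. The sketch below is an assumption-laden toy, not the authors' estimator: the chain is an AR(1) process standing in for an MCMC sampler, level k uses the ergodic average over 2^k steps, and a random level K is drawn with probability proportional to 2^{-K}. With T = 2^k_max, the estimator has the same expectation as the finest-level average (bias of order 1/T for this toy chain), while the expected number of transitions is about log2(T). All function names and parameters (`level_probs`, `running_means`, `rho`, `x0`) are hypothetical.

```python
import math
import random


def level_probs(k_max):
    """P(K = k) for k = 1..k_max, proportional to 2^{-k} (an assumed choice)."""
    weights = [2.0 ** (-k) for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_cost(k_max):
    """Expected number of chain transitions when level k costs 2^k steps."""
    return sum(p * 2 ** k for k, p in enumerate(level_probs(k_max), start=1))


def running_means(n_steps, rho, x0, rng):
    """Toy chain: AR(1) with stationary law N(0, 1), started at x0.

    Returns the ergodic averages after 1, 2, ..., n_steps transitions.
    """
    x, total, means = x0, 0.0, []
    scale = math.sqrt(1.0 - rho * rho)
    for t in range(1, n_steps + 1):
        x = rho * x + scale * rng.gauss(0.0, 1.0)
        total += x
        means.append(total / t)
    return means


def multilevel_estimate(k_max, rho=0.5, x0=5.0, rng=random):
    """Single-term randomized multilevel estimate of the stationary mean.

    Returns (estimate, number of transitions used).
    """
    probs = level_probs(k_max)
    k = rng.choices(range(1, k_max + 1), weights=probs)[0]
    means = running_means(2 ** k, rho, x0, rng)
    y0 = means[0]  # coarsest level: average over one step
    # Coupled correction: levels k and k-1 share the same chain.
    delta = means[2 ** k - 1] - means[2 ** (k - 1) - 1]
    return y0 + delta / probs[k - 1], 2 ** k


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [multilevel_estimate(8, rng=rng)[0] for _ in range(10000)]
    print("empirical mean:", sum(draws) / len(draws))
    print("expected cost :", expected_cost(8))
```

Dividing the correction by the level probability is what makes the telescoping sum of level differences collapse to the finest-level expectation, so coarse (cheap) levels are sampled most of the time without changing the target of the estimator.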
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference
