Convergence of projected stochastic natural gradient variational inference for various step size and sample or batch size schedules

Thomas Guilmeau; Hadrien Hendrikx; Florence Forbes

arXiv:2604.00683·stat.ME·April 2, 2026


TL;DR

This paper analyzes the convergence of projected stochastic natural gradient variational inference (NGVI) over exponential-family variational distributions, proving new non-asymptotic rates for combinations of constant or decreasing step sizes and constant or increasing sample/batch sizes.

Contribution

It establishes new non-asymptotic convergence rates for projected stochastic NGVI with different hyperparameter schedules, including geometric and polynomial rates.

Findings

01

With all hyperparameters fixed, projected stochastic NGVI converges geometrically to a neighborhood of the optimum.

02

For all other combinations of step size and sample/batch size schedules, the method converges to the optimum itself at rates of the form O(1/T^ρ), possibly with ρ ≥ 1.

03

The results apply when the target distribution is close to the exponential family considered.

Abstract

Stochastic natural gradient variational inference (NGVI) is a popular and efficient algorithm for Bayesian inference. Despite its empirical success, the convergence of this method is still not fully understood. In this work, we define and study a projected stochastic NGVI when variational distributions form an exponential family. Stochasticity arises when gradients are either intractable expectations or large sums. We prove new non-asymptotic convergence results for combinations of constant or decreasing step sizes and constant or increasing sample/batch sizes. When all hyperparameters are fixed, NGVI is shown to converge geometrically to a neighborhood of the optimum, while we establish convergence to the optimum with rates of the form $O(1/T^{\rho})$, possibly with $\rho \geq 1$, for all other combinations of step size and sample/batch size schedules. These…
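The two regimes described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm or its projection set: it assumes a one-dimensional Gaussian variational family, a Gaussian target N(a, b^2) (so the optimum is the target itself), a Monte Carlo estimate of the gradient, and a simple clipping of the second natural parameter as a stand-in for the projection. The function name `run_ngvi` and all of its parameters are hypothetical.

```python
import math
import random


def run_ngvi(a, b, step_size, n_samples, n_iters, seed=0, eps=1e-6):
    """Toy projected stochastic NGVI fitting q = N(m, s^2) to the target N(a, b^2).

    step_size(t) and n_samples(t) are the (assumed) hyperparameter schedules.
    Returns the final (mean, variance) of q.
    """
    rng = random.Random(seed)
    # Natural parameters of q: lam1 = m / s^2, lam2 = -1 / (2 s^2). Start at N(0, 1).
    lam1, lam2 = 0.0, -0.5
    for t in range(n_iters):
        gamma = step_size(t)
        n = n_samples(t)
        var = -1.0 / (2.0 * lam2)
        mean = lam1 * var
        std = math.sqrt(var)
        # Potential U(x) = (x - a)^2 / (2 b^2), so U'(x) = (x - a) / b^2, U''(x) = 1 / b^2.
        xs = [rng.gauss(mean, std) for _ in range(n)]
        grad_est = sum((x - a) / b ** 2 for x in xs) / n  # Monte Carlo estimate of E_q[U']
        hess = 1.0 / b ** 2                               # E_q[U''] is exact for this target
        # Natural-gradient step in natural parameters: a convex combination of the
        # current parameters and a noisy estimate of the target's natural parameters.
        lam1 = (1.0 - gamma) * lam1 + gamma * (mean * hess - grad_est)
        lam2 = (1.0 - gamma) * lam2 - gamma * 0.5 * hess
        # Assumed projection: keep lam2 strictly negative so q stays a valid Gaussian.
        lam2 = min(lam2, -eps)
    var = -1.0 / (2.0 * lam2)
    return lam1 * var, var


if __name__ == "__main__":
    # Fixed step size and sample size: iterates keep fluctuating around the optimum.
    print("fixed:", run_ngvi(2.0, 0.5, lambda t: 0.5, lambda t: 10, 200))
    # Decreasing step size 1 / (t + 1): the noise is averaged out over iterations.
    print("decreasing:", run_ngvi(2.0, 0.5, lambda t: 1.0 / (t + 1), lambda t: 10, 2000))
```

In this toy setting, the fixed schedule leaves a residual noise level in the mean that depends on the step size and sample size, while the decreasing step size averages the noisy estimates so the mean approaches `a`. Passing an increasing `n_samples` schedule is the other way of shrinking the noise mentioned in the abstract.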
