Understanding Stochastic Natural Gradient Variational Inference
Kaiwen Wu, Jacob R. Gardner

TL;DR
This paper analyzes the convergence rates of stochastic natural gradient variational inference (NGVI), providing the first non-asymptotic rate for conjugate likelihoods and discussing challenges for non-conjugate cases.
Contribution
It establishes the first $\mathcal{O}(1/T)$ non-asymptotic convergence rate for stochastic NGVI with conjugate likelihoods, and shows that for non-conjugate likelihoods the canonical parameterization leads to a non-convex objective, which makes a global rate hard to obtain.
Findings
Proves an $\mathcal{O}(1/T)$ convergence rate for conjugate likelihoods.
Shows that the complexity of stochastic NGVI is no worse than that of stochastic gradient descent (black-box variational inference).
Highlights challenges in achieving global convergence for non-conjugate likelihoods.
Abstract
Stochastic natural gradient variational inference (NGVI) is a popular posterior inference method with applications in various probabilistic models. Despite its wide usage, little is known about the non-asymptotic convergence rate in the stochastic setting. We aim to lessen this gap and provide a better understanding. For conjugate likelihoods, we prove the first non-asymptotic convergence rate of stochastic NGVI. The complexity is no worse than stochastic gradient descent (a.k.a. black-box variational inference) and the rate likely has better constant dependency that leads to faster convergence in practice. For non-conjugate likelihoods, we show that stochastic NGVI with the canonical parameterization implicitly optimizes a non-convex objective. Thus, a global convergence rate of $\mathcal{O}(1/T)$ is unlikely without some significant new…
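To make the conjugate setting concrete, here is a minimal sketch of stochastic NGVI on a toy model that is not taken from the paper: Gaussian observations with known variance and a Gaussian prior on the mean. In a conjugate exponential-family model, a natural gradient step on the natural parameters reduces to a convex combination of the current parameters and a minibatch estimate of the posterior's natural parameters. The function names, the toy model, and the step size schedule $\rho_t = 1/(t+1)$ are illustrative assumptions, not the authors' exact algorithm or analysis.

```python
import numpy as np


def exact_posterior_natural(x, sigma2, mu0, tau02):
    """Natural parameters (m/s^2, -1/(2 s^2)) of the exact Gaussian posterior."""
    n = x.shape[0]
    return np.array([mu0 / tau02 + x.sum() / sigma2,
                     -0.5 / tau02 - 0.5 * n / sigma2])


def stochastic_ngvi(x, sigma2, mu0, tau02, num_steps, batch_size, seed=0):
    """Stochastic NGVI for x_i ~ N(theta, sigma2), theta ~ N(mu0, tau02).

    Hypothetical helper: q(theta) is Gaussian and is tracked through its
    natural parameters lam = (m/s^2, -1/(2 s^2)).
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    prior = np.array([mu0 / tau02, -0.5 / tau02])
    lam = prior.copy()  # start q at the prior
    for t in range(1, num_steps + 1):
        # Minibatch sampled with replacement, rescaled to the full data size.
        idx = rng.integers(0, n, size=batch_size)
        lik = (n / batch_size) * np.array([x[idx].sum() / sigma2,
                                           -0.5 * batch_size / sigma2])
        target = prior + lik  # unbiased estimate of the posterior parameters
        rho = 1.0 / (t + 1)   # assumed decreasing step size
        # Natural gradient step in the conjugate case: a convex combination.
        lam = (1.0 - rho) * lam + rho * target
    return lam


def natural_to_mean_var(lam):
    """Convert Gaussian natural parameters back to (mean, variance)."""
    var = -0.5 / lam[1]
    return lam[0] * var, var
```

Because each iterate is a convex combination of valid natural parameters, the second component stays negative and the variational variance stays positive without any projection. Comparing `natural_to_mean_var(stochastic_ngvi(...))` against `natural_to_mean_var(exact_posterior_natural(...))` on synthetic data gives a quick check that the iterates approach the exact posterior.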
Taxonomy
Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques
Methods: Variational Inference
