Understanding Stochastic Natural Gradient Variational Inference

Kaiwen Wu; Jacob R. Gardner

arXiv:2406.01870·cs.LG·June 5, 2024

TL;DR

This paper analyzes the convergence rates of stochastic natural gradient variational inference (NGVI), providing the first non-asymptotic rate for conjugate likelihoods and discussing challenges for non-conjugate cases.

Contribution

The paper establishes the first $O(1/T)$ non-asymptotic convergence rate for stochastic NGVI with conjugate likelihoods and explains why a comparable global rate is harder to obtain for non-conjugate likelihoods.

Findings

01

Proves an $O(1/T)$ convergence rate for conjugate likelihoods.

02

Shows that the complexity of stochastic NGVI is no worse than that of stochastic gradient descent (black-box variational inference).

03

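Shows that, for non-conjugate likelihoods, stochastic NGVI with the canonical parameterization implicitly optimizes a non-convex objective, which makes a global $O(1/T)$ rate unlikely.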

Abstract

Stochastic natural gradient variational inference (NGVI) is a popular posterior inference method with applications in various probabilistic models. Despite its wide usage, little is known about the non-asymptotic convergence rate in the stochastic setting. We aim to lessen this gap and provide a better understanding. For conjugate likelihoods, we prove the first $O(1/T)$ non-asymptotic convergence rate of stochastic NGVI. The complexity is no worse than stochastic gradient descent (a.k.a. black-box variational inference) and the rate likely has better constant dependency that leads to faster convergence in practice. For non-conjugate likelihoods, we show that stochastic NGVI with the canonical parameterization implicitly optimizes a non-convex objective. Thus, a global convergence rate of $O(1/T)$ is unlikely without some significant new…
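To make the conjugate setting concrete, here is a minimal sketch of a stochastic NGVI iteration on a toy model. Everything in it is an illustrative assumption rather than the paper's setup: the model (Gaussian mean with known noise variance and a zero-mean Gaussian prior), the function names, the (precision-weighted mean, precision) parameterization, and the $1/(t+1)$ step size. For conjugate exponential-family models, the natural gradient of the ELBO with respect to the natural parameters is the difference between a target (prior plus data statistics) and the current parameters, so each step is a convex combination of the two.

```python
import random

def exact_posterior(ys, prior_prec, noise_var):
    """Exact Gaussian posterior as (precision-weighted mean, precision)."""
    return (sum(ys) / noise_var, prior_prec + len(ys) / noise_var)

def stochastic_ngvi(ys, prior_prec, noise_var, steps, batch_size, seed=0):
    """Stochastic NGVI sketch for the toy conjugate model (hypothetical helper).

    The variational Gaussian is tracked through lam = (precision * mean, precision).
    """
    rng = random.Random(seed)
    n = len(ys)
    lam = (0.0, prior_prec)  # initialize at the zero-mean prior
    for t in range(1, steps + 1):
        # Minibatch sampled with replacement; rescaled to estimate the full-data sum.
        batch = [ys[rng.randrange(n)] for _ in range(batch_size)]
        scale = n / batch_size
        target = (scale * sum(batch) / noise_var, prior_prec + n / noise_var)
        rho = 1.0 / (t + 1)  # assumed decaying step size, not taken from the paper
        # Natural gradient step: lam + rho * (target - lam).
        lam = tuple((1.0 - rho) * l + rho * g for l, g in zip(lam, target))
    return lam

if __name__ == "__main__":
    data_rng = random.Random(1)
    ys = [2.0 + data_rng.gauss(0.0, 1.0) for _ in range(200)]
    lam = stochastic_ngvi(ys, prior_prec=1.0, noise_var=1.0, steps=2000, batch_size=10)
    exact = exact_posterior(ys, prior_prec=1.0, noise_var=1.0)
    print("NGVI posterior mean: ", lam[0] / lam[1])
    print("exact posterior mean:", exact[0] / exact[1])
```

With this step size the iterate is a running average of the noisy targets, which is one simple way to see why the error can shrink as the number of steps grows. The sketch does not reproduce the paper's analysis or its constants.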

Taxonomy

Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques

Methods: Variational Inference