# Manifold Optimization Assisted Gaussian Variational Approximation

**Authors:** Bingxin Zhou, Junbin Gao, Minh-Ngoc Tran, Richard Gerlach

arXiv: 1902.03718 · 2021-04-07

## TL;DR

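This paper applies manifold optimization to Gaussian variational approximation for Bayesian inference. It imposes Stiefel and Grassmann manifold constraints on a low-rank-plus-diagonal covariance to fix the identification issue, and proposes two Riemannian stochastic gradient descent schemes that need little hyperparameter tuning.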

## Contribution

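It proposes two new Riemannian stochastic gradient descent schemes for Gaussian variational approximation under Stiefel and Grassmann manifold constraints. The schemes are designed to speed up updates with minimal hyperparameter tuning, and the manifold constraints replace the stringent covariance constraints that could break down during optimization.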
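To make the idea of a gradient step on the Stiefel manifold concrete, here is a minimal sketch of a generic, textbook-style Riemannian update: project the Euclidean gradient onto the tangent space, take a step, and retract back onto the manifold with a QR decomposition. It is not one of the two schemes the paper proposes, which this summary does not spell out. The function names, the toy objective, and the step size are illustrative assumptions.

```python
import numpy as np

def sym(M):
    """Symmetric part of a square matrix."""
    return 0.5 * (M + M.T)

def stiefel_project(B, G):
    """Project a Euclidean gradient G onto the tangent space of the
    Stiefel manifold {B : B^T B = I} at B (embedded metric)."""
    return G - B @ sym(B.T @ G)

def qr_retract(X):
    """Map a full-column-rank matrix back onto the manifold via reduced QR."""
    Q, R = np.linalg.qr(X)
    # Flip column signs so that diag(R) > 0, which makes the Q factor unique.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

def riemannian_sgd_step(B, euclid_grad, lr):
    """One step: tangent projection followed by QR retraction.
    euclid_grad may be a noisy (stochastic) gradient estimate."""
    xi = stiefel_project(B, euclid_grad)
    return qr_retract(B - lr * xi)

# Toy problem (assumption): maximise trace(B^T A B) for a symmetric A,
# i.e. minimise f(B) = -trace(B^T A B), whose Euclidean gradient is -2 A B.
rng = np.random.default_rng(0)
dim, rank = 6, 2
A = sym(rng.standard_normal((dim, dim)))
B = qr_retract(rng.standard_normal((dim, rank)))

for _ in range(200):
    B = riemannian_sgd_step(B, -2.0 * A @ B, lr=0.05)

# The iterate keeps orthonormal columns, and the projected gradient is tangent.
ortho_error = np.linalg.norm(B.T @ B - np.eye(rank))
xi = stiefel_project(B, -2.0 * A @ B)
tangent_error = np.linalg.norm(B.T @ xi + xi.T @ B)
```

The QR retraction is only one possible choice; other retractions, such as the polar decomposition or the Cayley transform, keep the iterate on the manifold as well.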

## Key findings

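- Obtains competitive accuracy in both high-dimensional and large-scale learning tasks, in simulation and empirical experiments.
- Shows convergence speed comparable to the existing manifold optimization methods it is compared against.
- Fixes the identification issue of the factorized covariance, whose uniqueness constraints could otherwise break down during optimization.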

## Abstract

Gaussian variational approximation is a popular methodology for approximating posterior distributions in Bayesian inference, especially in high-dimensional and large-data settings. To control the computational cost while still capturing the correlations among the variables, previous literature introduced a low-rank-plus-diagonal structure for the Gaussian covariance matrix. For a specific Bayesian learning task, the uniqueness of the solution is usually ensured by imposing stringent constraints on the parameterized covariance matrix, which could break down during the optimization process. In this paper, we consider two special covariance structures obtained by applying Stiefel manifold and Grassmann manifold constraints, to address the optimization difficulty in such factorization architectures. To speed up the updating process with minimal hyperparameter tuning, we design two new schemes of Riemannian stochastic gradient descent and compare them with other existing methods for optimizing on manifolds. In addition to fixing the identification issue, results from both simulation and empirical experiments demonstrate that the proposed methods obtain competitive accuracy and comparable convergence speed in both high-dimensional and large-scale learning tasks.
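The low-rank-plus-diagonal structure and the identification issue mentioned in the abstract can be illustrated with a short sketch. It assumes the common parameterization $\Sigma = BB^\top + \mathrm{diag}(d)^2$ with a $D \times p$ factor $B$; the paper's exact constrained structures may differ, and all names here are hypothetical. Rotating $B$ by any orthogonal $Q$ leaves $\Sigma$ unchanged, which is why extra constraints are needed to make the factor unique.

```python
import numpy as np

def covariance(B, d):
    """Low-rank-plus-diagonal covariance: B B^T + diag(d)^2 (assumed form)."""
    return B @ B.T + np.diag(d ** 2)

def sample(mu, B, d, n, rng):
    """Draw n samples theta = mu + B eps + d * zeta without forming the
    full covariance; eps and zeta are standard normal."""
    eps = rng.standard_normal((n, B.shape[1]))
    zeta = rng.standard_normal((n, mu.shape[0]))
    return mu + eps @ B.T + zeta * d

rng = np.random.default_rng(1)
dim, rank = 5, 2
mu = np.zeros(dim)
B = rng.standard_normal((dim, rank))
d = np.full(dim, 0.5)

# Any orthogonal Q (here a 2-D rotation) gives a different factor B Q
# but exactly the same covariance, so B is not identified by Sigma alone.
angle = 0.7
Q = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
same_cov = np.allclose(covariance(B, d), covariance(B @ Q, d))
factors_differ = not np.allclose(B, B @ Q)

draws = sample(mu, B, d, 1000, rng)
```

Under this parameterization the approximation stores `dim * rank + dim` covariance parameters rather than the `dim * (dim + 1) / 2` of a full covariance matrix, which is where the computational saving comes from.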

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.03718/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/1902.03718/full.md

## References

60 references — full list in the complete paper: https://tomesphere.com/paper/1902.03718/full.md

---
Source: https://tomesphere.com/paper/1902.03718