Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors
Christos Louizos, Max Welling

TL;DR
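This paper proposes a variational Bayesian neural network whose parameter posterior is a matrix variate Gaussian, which explicitly models the covariance among the input and output dimensions of each layer. Applying the local reparametrization trick to this posterior gives a Gaussian process interpretation of the hidden units and a connection to deep Gaussian processes, and the paper uses this connection to make sampling more efficient.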
Contribution
It introduces a matrix variate Gaussian posterior for Bayesian neural networks, links the resulting model to Gaussian processes, and improves sampling efficiency.
Findings
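The covariance among the input and output dimensions of each layer is modeled explicitly; with approximate covariance matrices this is cheaper than a fully factorized parameter posterior.
With the local reparametrization trick, the hidden units in each layer admit a Gaussian process interpretation, which connects the model to deep Gaussian processes.
Sampling is made more efficient while the properties of the model are maintained.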
Abstract
We introduce a variational Bayesian neural network where the parameters are governed via a probability distribution on random matrices. Specifically, we employ a matrix variate Gaussian \cite{gupta1999matrix} parameter posterior distribution where we explicitly model the covariance among the input and output dimensions of each layer. Furthermore, with approximate covariance matrices we can achieve a more efficient way to represent those correlations that is also cheaper than fully factorized parameter posteriors. We further show that with the "local reparametrization trick" \cite{kingma2015variational} on this posterior distribution we arrive at a Gaussian Process \cite{rasmussen2006gaussian} interpretation of the hidden units in each layer and we, similarly with \cite{gal2015dropout}, provide connections with deep Gaussian processes. We continue in taking advantage of this duality and…
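The two ideas the abstract names, a matrix variate Gaussian over a layer's weights and the local reparametrization trick, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes diagonal row and column covariances (one possible reading of "approximate covariance matrices") and independent noise per data point. The function names `sample_matrix_gaussian`, `local_reparam_hidden`, and `param_counts` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def sample_matrix_gaussian(M, u_diag, v_diag, rng):
    """Draw W ~ MN(M, diag(u_diag), diag(v_diag)).

    M is the (n_in, n_out) mean, u_diag holds the n_in row (input) variances,
    v_diag holds the n_out column (output) variances. Diagonal covariances
    are an assumption of this sketch.
    """
    E = rng.standard_normal(M.shape)
    # W = M + U^{1/2} E V^{1/2}, written out for diagonal U and V.
    return M + np.sqrt(u_diag)[:, None] * E * np.sqrt(v_diag)[None, :]

def local_reparam_hidden(A, M, u_diag, v_diag, rng):
    """Sample pre-activations B = A W without sampling W itself.

    For one input row a, the product a W is Gaussian with mean a M and
    covariance (a U a^T) V. Each row of A gets its own independent noise
    here (an assumption of this sketch), so only per-row marginals are used.
    """
    mean = A @ M                   # shape (N, n_out)
    row_var = (A ** 2) @ u_diag    # a U a^T for each row, shape (N,)
    E = rng.standard_normal(mean.shape)
    return mean + np.sqrt(row_var)[:, None] * E * np.sqrt(v_diag)[None, :]

def param_counts(n_in, n_out):
    """Variational parameters per layer: (fully factorized, matrix Gaussian with diagonal U and V)."""
    factorized = 2 * n_in * n_out              # one mean and one variance per weight
    matrix_gaussian = n_in * n_out + n_in + n_out
    return factorized, matrix_gaussian

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))    # 4 data points, 3 input units
M = np.zeros((3, 2))               # 3 inputs, 2 outputs
B = local_reparam_hidden(A, M, np.full(3, 0.1), np.full(2, 0.5), rng)
print(B.shape)                     # (4, 2)
print(param_counts(784, 500))      # (784000, 393284)
```

Under these assumptions the posterior needs one mean per weight plus one variance per input and per output unit, which is where the saving over a fully factorized posterior comes from in this sketch.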
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models · Domain Adaptation and Few-Shot Learning
