The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
Jakub Swiatkowski, Kevin Roth, Bastiaan S. Veeling, Linh Tran, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

TL;DR
This paper introduces the k-tied Normal Distribution, a low-rank parameterization of Gaussian mean-field posteriors in Bayesian neural networks, which improves efficiency without sacrificing accuracy.
Contribution
The paper shows that restricting the posterior standard deviations of a Gaussian mean-field variational posterior to a low-rank structure makes the approximation more compact and improves convergence in Bayesian neural networks.
Findings
Posterior standard deviations exhibit low-rank structure after training.
Low-rank factorization improves gradient estimate signal-to-noise ratio.
Compact parameterization maintains model performance while reducing complexity.
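One way to check the first finding on a given layer is to look at how much of the standard-deviation matrix is captured by its leading singular values. The sketch below is illustrative only: the function name `explained_variance_ratio` and the synthetic matrix `sigma` are assumptions made for this example, not code or data from the paper, and the paper's own diagnostic may differ.

```python
import numpy as np


def explained_variance_ratio(sigma, k):
    """Fraction of squared singular-value mass held by the top-k components."""
    s = np.linalg.svd(sigma, compute_uv=False)  # singular values, descending
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))


# Hypothetical stand-in for the posterior standard deviations of one layer.
# It is built as a product of two positive factors, so it has rank k exactly.
rng = np.random.default_rng(0)
m, n, k = 64, 32, 2
u = rng.uniform(0.1, 1.0, size=(m, k))
v = rng.uniform(0.1, 1.0, size=(n, k))
sigma = u @ v.T

ratio_k = explained_variance_ratio(sigma, k)  # close to 1 for a rank-k matrix
ratio_1 = explained_variance_ratio(sigma, 1)  # share held by the first component
```

With real posterior standard deviations in place of the synthetic `sigma`, a ratio near 1 for small `k` would indicate the kind of low-rank structure described above.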
Abstract
Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational distribution to a more compact parameterization. For a variety of deep Bayesian neural networks trained using Gaussian mean-field variational inference, we find that the posterior standard deviations consistently exhibit strong low-rank structure after convergence. This means that by decomposing these variational parameters into a low-rank factorization, we can make our variational approximation more compact without decreasing the models' performance. Furthermore, we find that such factorized…
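The low-rank factorization mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the standard-deviation matrix of an m x n layer is written as the product of two positive factors with k columns each, and that positivity is enforced by exponentiating unconstrained parameters. The names `log_u`, `log_v`, `k_tied_sigma`, and `sample_weights` are hypothetical.

```python
import numpy as np


def k_tied_sigma(log_u, log_v):
    """Build an (m, n) matrix of standard deviations from two rank-k factors.

    Exponentiating the unconstrained parameters keeps every entry positive
    (an assumption of this sketch, not a detail taken from the source).
    """
    u = np.exp(log_u)  # shape (m, k)
    v = np.exp(log_v)  # shape (n, k)
    return u @ v.T


def sample_weights(mu, log_u, log_v, rng):
    """Draw one weight sample via the reparameterization w = mu + sigma * eps."""
    sigma = k_tied_sigma(log_u, log_v)
    eps = rng.standard_normal(mu.shape)
    return mu + sigma * eps


rng = np.random.default_rng(0)
m, n, k = 256, 128, 2
mu = rng.standard_normal((m, n)) * 0.05
log_u = rng.standard_normal((m, k)) * 0.1 - 1.0
log_v = rng.standard_normal((n, k)) * 0.1 - 1.0

sigma = k_tied_sigma(log_u, log_v)
w = sample_weights(mu, log_u, log_v, rng)

# Number of standard-deviation parameters for this layer.
full_params = m * n        # one per weight in plain mean-field
tied_params = k * (m + n)  # two thin factors in the low-rank form
```

For this hypothetical layer size, the factorized form stores `k * (m + n)` standard-deviation parameters instead of `m * n`, while the means are left untouched.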
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
