The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks

Jakub Swiatkowski; Kevin Roth; Bastiaan S. Veeling; Linh Tran; Joshua V. Dillon; Jasper Snoek; Stephan Mandt; Tim Salimans; Rodolphe Jenatton; Sebastian Nowozin

arXiv:2002.02655·cs.LG·July 7, 2020·22 cites

Open Access 1 Video

TL;DR

This paper introduces the k-tied Normal distribution, which applies a low-rank factorization to the posterior standard deviations of Gaussian mean-field posteriors in Bayesian neural networks, making the variational approximation more compact without decreasing model performance.

Contribution

The paper demonstrates that restricting the variational posterior's standard deviations to a low-rank structure yields a more compact parameterization and improves convergence in Bayesian neural networks.

Findings

01

Posterior standard deviations exhibit low-rank structure after training.

02

Low-rank factorization improves gradient estimate signal-to-noise ratio.

03

Compact parameterization maintains model performance while reducing complexity.
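One way to check the first finding on a given layer is to take the matrix of posterior standard deviations and ask how much of its squared singular-value mass the top few components capture. The sketch below is illustrative only: the matrix `sigma` is synthetic (a rank-2 product plus small noise standing in for a trained layer), and the helper name `explained_variance_ratio` is hypothetical, not taken from the paper.

```python
import numpy as np


def explained_variance_ratio(sigma, k):
    """Fraction of squared singular-value mass captured by the top-k components."""
    s = np.linalg.svd(sigma, compute_uv=False)  # singular values, descending
    return float(np.sum(s[:k] ** 2) / np.sum(s ** 2))


rng = np.random.default_rng(0)
m, n, true_rank = 64, 32, 2

# Synthetic stand-in for a trained layer's posterior standard deviations:
# a positive rank-2 matrix plus a small positive perturbation.
u = rng.uniform(0.1, 1.0, size=(m, true_rank))
v = rng.uniform(0.1, 1.0, size=(n, true_rank))
sigma = u @ v.T + 1e-3 * rng.uniform(size=(m, n))

# Ratio for k = 1..5; a sharp rise to nearly 1 at small k indicates low-rank structure.
ratios = [explained_variance_ratio(sigma, k) for k in range(1, 6)]
for k, r in enumerate(ratios, start=1):
    print(f"k={k}: {r:.6f}")
```

On real networks, `sigma` would be replaced by the learned standard-deviation matrix of each layer; the synthetic data here only demonstrates the measurement, not the paper's results.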

Abstract

Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational distribution to a more compact parameterization. For a variety of deep Bayesian neural networks trained using Gaussian mean-field variational inference, we find that the posterior standard deviations consistently exhibit strong low-rank structure after convergence. This means that by decomposing these variational parameters into a low-rank factorization, we can make our variational approximation more compact without decreasing the models' performance. Furthermore, we find that such factorized…
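To make the idea of a low-rank factorization of the standard deviations concrete, here is a minimal sketch, assuming a single m × n weight matrix whose standard deviations are written as the product of two positive factors with k columns each. The function names, the use of softplus to keep the factors positive, and the layer sizes are all assumptions made for illustration, not details confirmed by the text above.

```python
import numpy as np


def softplus(x):
    """Numerically stable log(1 + exp(x)); maps real values to positive ones."""
    return np.logaddexp(0.0, x)


def k_tied_sigma(u_raw, v_raw):
    """Standard deviations as a rank-k product of positive factors (assumed form)."""
    u = softplus(u_raw)  # shape (m, k), strictly positive
    v = softplus(v_raw)  # shape (n, k), strictly positive
    return u @ v.T       # shape (m, n), rank at most k


def sample_weights(mean, u_raw, v_raw, rng):
    """Reparameterized sample W = mean + sigma * eps with eps ~ N(0, I)."""
    sigma = k_tied_sigma(u_raw, v_raw)
    eps = rng.standard_normal(mean.shape)
    return mean + sigma * eps


m, n, k = 256, 128, 2  # hypothetical layer size and rank
rng = np.random.default_rng(0)
mean = 0.05 * rng.standard_normal((m, n))
u_raw = rng.standard_normal((m, k)) - 3.0  # shifted so initial sigmas are small
v_raw = rng.standard_normal((n, k)) - 3.0

w = sample_weights(mean, u_raw, v_raw, rng)

# Variational parameter counts for this one layer.
full_params = 2 * m * n              # mean and std for every weight
k_tied_params = m * n + k * (m + n)  # full mean, factorized std
print(full_params, k_tied_params)
```

For this hypothetical layer the standard-deviation parameters drop from m·n to k·(m + n), so the total count is roughly halved; the means are left unfactorized in this sketch.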

Videos

The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks · SlidesLive

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning