Variational Gaussian Dropout is not Bayesian

Jiri Hron; Alexander G. de G. Matthews; Zoubin Ghahramani

arXiv:1711.02989 · stat.ML · November 9, 2017 · 29 cites

TL;DR

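This paper critically examines variational Gaussian dropout and argues that its Bayesian interpretation is flawed: the log-uniform prior does not generally induce a proper posterior, and the correlated weight noise approximation leads to either an infinite objective or a high risk of overfitting. It then analyses the objective from a non-Bayesian perspective and derives an analytical form that allows exact gradient computation.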

Contribution

The paper shows that Bayesian inference with the log-uniform prior used in variational Gaussian dropout is ill-posed, and it provides a previously unknown analytical form of the objective function.

Findings

01. The log-uniform prior does not generally induce a proper posterior.

02. Independent of the prior, the correlated weight noise approximation leads to either an infinite objective or a high risk of overfitting.

03. The additive reparametrisation introduces new minima.
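
The first finding can be motivated with a short one-dimensional sketch. This is not the paper's proof, only an illustration under stated assumptions: a single weight $w$, the log-uniform prior written as $p(w) \propto 1/|w|$, and a likelihood $p(D \mid w)$ that is bounded below by some constant $c > 0$ on a neighbourhood $(-\delta, \delta)$ of zero. The symbols $c$, $\delta$, and $\varepsilon$ are introduced here for illustration.

```latex
% The prior mass near zero diverges logarithmically:
\int_{\varepsilon}^{1} \frac{\mathrm{d}w}{w}
  = \log\frac{1}{\varepsilon} \;\longrightarrow\; \infty
  \quad \text{as } \varepsilon \to 0^{+} .

% If p(D | w) >= c > 0 for all |w| < \delta, the would-be normaliser
% of the posterior inherits the same divergence:
\int_{-\infty}^{\infty} \frac{p(D \mid w)}{|w|}\,\mathrm{d}w
  \;\ge\; c \int_{-\delta}^{\delta} \frac{\mathrm{d}w}{|w|}
  \;=\; \infty .
```

Under these assumptions the product of prior and likelihood cannot be normalised, so no proper posterior exists. A likelihood that stays positive when a weight is set to zero is enough to trigger this.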

Abstract

Gaussian multiplicative noise is commonly used as a stochastic regularisation technique in the training of deterministic neural networks. A recent paper reinterpreted the technique as a specific algorithm for approximate inference in Bayesian neural networks; several extensions ensued. We show that the log-uniform prior used in all the above publications does not generally induce a proper posterior, and thus Bayesian inference in such models is ill-posed. Independent of the log-uniform prior, the correlated weight noise approximation has further issues leading to either an infinite objective or a high risk of overfitting. The above implies that the reported sparsity of obtained solutions cannot be explained by Bayesian or the related minimum description length arguments. We thus study the objective from a non-Bayesian perspective, provide its previously unknown analytical form which allows exact…
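
For readers unfamiliar with the regulariser the abstract opens with, the following is a minimal sketch of Gaussian multiplicative noise applied to a list of activations. It assumes the common parametrisation in which each activation is multiplied by independent noise drawn from N(1, alpha); the function name `gaussian_dropout` and the parameter `alpha` are illustrative choices, not names taken from the paper, and the sketch uses only the Python standard library.

```python
import math
import random


def gaussian_dropout(activations, alpha, rng):
    """Multiply each activation by independent noise drawn from N(1, alpha).

    `alpha` is the noise variance (an assumed parametrisation).
    """
    sigma = math.sqrt(alpha)
    return [a * rng.gauss(1.0, sigma) for a in activations]


# Apply the noise to constant activations and check its first two moments.
rng = random.Random(0)
activations = [1.0] * 200_000
noisy = gaussian_dropout(activations, alpha=0.5, rng=rng)

mean = sum(noisy) / len(noisy)
var = sum((v - mean) ** 2 for v in noisy) / len(noisy)
print(f"empirical mean: {mean:.3f}, empirical variance: {var:.3f}")
```

Because the noise has mean one, the expected activation is unchanged, while `alpha` controls how strongly each unit is perturbed.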

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Forecasting Techniques and Applications · Data Stream Mining Techniques