Variational Laplace for Bayesian neural networks

Ali Unlu; Laurence Aitchison

arXiv:2011.10443 · stat.ML · August 11, 2021

TL;DR

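This paper introduces Variational Laplace, a method for Bayesian neural networks that estimates the evidence lower bound (ELBO) from a local approximation of the likelihood's curvature, without sampling the network weights. It gave better test performance and expected calibration error than maximum a-posteriori (MAP) inference and standard sampling-based variational inference (VI).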

Contribution

It proposes Variational Laplace, a simple and effective approximation for Bayesian neural networks that avoids stochastic sampling and improves calibration and accuracy.

Findings

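01. Better test performance than MAP inference and standard sampling-based VI.

02. Lower expected calibration error than both baselines.

03. Standard VI risks being stopped before its variance parameters have converged; increasing the learning rate for those parameters avoids this.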

Abstract

We develop variational Laplace for Bayesian neural networks (BNNs) which exploits a local approximation of the curvature of the likelihood to estimate the ELBO without the need for stochastic sampling of the neural-network weights. The Variational Laplace objective is simple to evaluate, as it is (in essence) the log-likelihood, plus weight-decay, plus a squared-gradient regularizer. Variational Laplace gave better test performance and expected calibration errors than maximum a-posteriori inference and standard sampling-based variational inference, despite using the same variational approximate posterior. Finally, we emphasise care needed in benchmarking standard VI as there is a risk of stopping before the variance parameters have converged. We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
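
To make the shape of the objective concrete, here is a minimal sketch of a sampling-free ELBO estimate with the three ingredients the abstract names: a log-likelihood term, a weight-decay term, and a squared-gradient regularizer. It is an illustration under stated assumptions, not the authors' implementation. It assumes a linear-regression model with known noise variance, a diagonal Gaussian approximate posterior, a zero-mean Gaussian prior, and a curvature approximated by the negative squared gradient of the full-batch log-likelihood. The function name, argument names, and the scaling of the regularizer are hypothetical; the paper defines the exact form.

```python
import numpy as np


def variational_laplace_objective(mu, log_sigma2, X, y, prior_var=1.0, noise_var=1.0):
    """Sampling-free ELBO estimate for linear regression (illustrative sketch).

    mu, log_sigma2: mean and log-variance of a diagonal Gaussian posterior.
    All names and scalings here are assumptions made for the example.
    """
    sigma2 = np.exp(log_sigma2)  # diagonal posterior variances

    # Log-likelihood evaluated at the posterior mean; no weights are sampled.
    resid = y - X @ mu
    log_lik = (-0.5 * np.sum(resid ** 2) / noise_var
               - 0.5 * len(y) * np.log(2 * np.pi * noise_var))

    # Gradient of the log-likelihood with respect to the mean weights
    # (analytic for this model; a network would use automatic differentiation).
    grad = X.T @ resid / noise_var

    # Squared-gradient regularizer: second-order Taylor term with the
    # diagonal curvature approximated by -grad**2 (assumed scaling).
    sq_grad_penalty = 0.5 * np.sum(sigma2 * grad ** 2)

    # KL(q || prior) for diagonal Gaussians; its mu**2 part is weight decay.
    kl = 0.5 * np.sum((sigma2 + mu ** 2) / prior_var - 1.0
                      - log_sigma2 + np.log(prior_var))

    elbo = log_lik - sq_grad_penalty - kl
    parts = {"log_lik": log_lik, "sq_grad_penalty": sq_grad_penalty, "kl": kl}
    return elbo, parts


# Tiny deterministic example: three data points, two weights.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
elbo, parts = variational_laplace_objective(np.zeros(2), np.zeros(2), X, y)
print(elbo, parts)
```

In this sketch the variance parameters enter only through the squared-gradient and KL terms, so the objective is deterministic given a batch, and it can be maximized with ordinary gradient ascent on `mu` and `log_sigma2`.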

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms

Methods: Stochastic Gradient Descent