What Are Bayesian Neural Network Posteriors Really Like?

Pavel Izmailov; Sharad Vikram; Matthew D. Hoffman; Andrew Gordon Wilson

arXiv:2104.14421 · cs.LG · April 30, 2021 · 71 cites

TL;DR

This paper uses full-batch Hamiltonian Monte Carlo (HMC) to study Bayesian neural network (BNN) posteriors. It examines their predictive performance, their robustness, and how they differ from the posteriors produced by approximate methods, and it challenges recent claims such as the need for posterior tempering.

Contribution

The paper shows that full-batch HMC can sample BNN posteriors on modern architectures, compares the result with cheaper approximate inference methods, and investigates properties such as posterior tempering and generalization under domain shift.

Findings

01. BNNs sampled with HMC can achieve significant performance gains over standard training and deep ensembles.

02. A single long HMC chain represents the posterior about as well as multiple shorter chains.

03. Posterior tempering is not needed for near-optimal performance.
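
For readers unfamiliar with the term, posterior tempering is commonly written as below. The notation is the usual one from the literature rather than something stated on this page, so treat the symbols ($\theta$ for the network parameters, $\mathcal{D}$ for the training data, $T$ for the temperature) as assumptions:

```latex
p_T(\theta \mid \mathcal{D}) \;\propto\; \bigl( p(\mathcal{D} \mid \theta)\, p(\theta) \bigr)^{1/T}
```

Setting $T = 1$ recovers the ordinary Bayesian posterior, while $T < 1$ gives a sharpened "cold" posterior. The third finding says that $T = 1$ is already close to optimal.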

Abstract

The posterior over Bayesian neural network (BNN) parameters is extremely high-dimensional and non-convex. For computational reasons, researchers approximate this posterior using inexpensive mini-batch methods such as mean-field variational inference or stochastic-gradient Markov chain Monte Carlo (SGMCMC). To investigate foundational questions in Bayesian deep learning, we instead use full-batch Hamiltonian Monte Carlo (HMC) on modern architectures. We show that (1) BNNs can achieve significant performance gains over standard training and deep ensembles; (2) a single long HMC chain can provide a comparable representation of the posterior to multiple shorter chains; (3) in contrast to recent studies, we find posterior tempering is not needed for near-optimal performance, with little evidence for a "cold posterior" effect, which we show is largely an artifact of data augmentation; (4) BMA…
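
As a rough illustration of the kind of sampler the abstract refers to, the sketch below runs HMC (leapfrog integration followed by a Metropolis accept/reject step) on a toy two-dimensional standard normal target. It is not the authors' implementation: the target, the function names (`log_prob`, `grad_log_prob`, `leapfrog`, `hmc_sample`), and the step size and trajectory length are all assumptions chosen for illustration. For a BNN, `log_prob` would be the log prior plus the log likelihood summed over the full training set, which is what makes the method "full-batch".

```python
import math
import random


def log_prob(theta):
    """Unnormalized log density of a toy target (assumed): independent standard normals."""
    return -0.5 * sum(t * t for t in theta)


def grad_log_prob(theta):
    """Gradient of log_prob with respect to theta."""
    return [-t for t in theta]


def leapfrog(theta, momentum, step_size, n_steps):
    """Simulate Hamiltonian dynamics with the leapfrog integrator."""
    theta = list(theta)
    grad = grad_log_prob(theta)
    # Initial half step for the momentum.
    momentum = [m + 0.5 * step_size * g for m, g in zip(momentum, grad)]
    for i in range(n_steps):
        # Full step for the position.
        theta = [t + step_size * m for t, m in zip(theta, momentum)]
        grad = grad_log_prob(theta)
        # Full momentum step, except after the last position update.
        if i < n_steps - 1:
            momentum = [m + step_size * g for m, g in zip(momentum, grad)]
    # Final half step for the momentum.
    momentum = [m + 0.5 * step_size * g for m, g in zip(momentum, grad)]
    return theta, momentum


def hamiltonian(theta, momentum):
    """Potential energy (negative log density) plus kinetic energy."""
    return -log_prob(theta) + 0.5 * sum(m * m for m in momentum)


def hmc_sample(theta0, n_samples, step_size=0.2, n_steps=10, seed=0):
    """Run one HMC chain; returns (list of samples, acceptance rate)."""
    rng = random.Random(seed)
    theta = list(theta0)
    samples = []
    n_accepted = 0
    for _ in range(n_samples):
        # Resample the momentum from a standard normal at every iteration.
        momentum = [rng.gauss(0.0, 1.0) for _ in theta]
        current_h = hamiltonian(theta, momentum)
        proposal, new_momentum = leapfrog(theta, momentum, step_size, n_steps)
        proposed_h = hamiltonian(proposal, new_momentum)
        # Metropolis correction: accept with probability min(1, exp(-delta H)).
        if rng.random() < math.exp(min(0.0, current_h - proposed_h)):
            theta = proposal
            n_accepted += 1
        samples.append(list(theta))
    return samples, n_accepted / n_samples


if __name__ == "__main__":
    chain, rate = hmc_sample([3.0, -3.0], n_samples=2000)
    print("acceptance rate:", rate)
```

The accept/reject step is what distinguishes this from the stochastic-gradient MCMC methods mentioned in the abstract, which typically use mini-batch gradients and omit the correction.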

Videos

What Are Bayesian Neural Network Posteriors Really Like? · SlidesLive

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods · Adversarial Robustness in Machine Learning

Methods: Variational Inference · Deep Ensembles