Bayesian Neural Network Priors Revisited
Vincent Fortuin, Adrià Garriga-Alonso, Sebastian W. Ober, Florian Wenzel, Gunnar Rätsch, Richard E. Turner, Mark van der Wilk, Laurence Aitchison

TL;DR
This paper investigates the limitations of isotropic Gaussian priors in Bayesian neural networks and proposes improved priors based on observed weight statistics, leading to better performance and insights into the cold posterior effect.
Contribution
It introduces new priors informed by empirical weight statistics, improving Bayesian neural network performance and understanding of the cold posterior effect.
Findings
CNN and ResNet weights show strong spatial correlations.
FCNN weights exhibit heavy-tailed distributions.
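Priors built from these observations improve classification accuracy on several image datasets; they mitigate the cold posterior effect in FCNNs but slightly increase it in ResNets.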
Abstract
Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, it is unclear whether these priors accurately reflect our true beliefs about the weight distributions or give optimal performance. To find better priors, we study summary statistics of neural network weights in networks trained using stochastic gradient descent (SGD). We find that convolutional neural network (CNN) and ResNet weights display strong spatial correlations, while fully connected networks (FCNNs) display heavy-tailed weight distributions. We show that building these observations into priors can lead to improved performance on a variety of image classification datasets. Surprisingly, these priors mitigate the cold posterior effect in FCNNs, but slightly increase the cold posterior effect in ResNets.
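To make the two observations in the abstract concrete, the sketch below contrasts an isotropic Gaussian prior with (a) a heavy-tailed Student-t prior on a single weight and (b) a spatially correlated Gaussian covariance over the pixels of one convolutional filter. It is a minimal illustration, not the paper's implementation: the function names, the degrees of freedom `df=3`, the 3x3 filter size, and the squared-exponential kernel with `lengthscale=1.0` are all assumptions chosen for the example.

```python
import math


def gaussian_logpdf(w, scale=1.0):
    """Log-density of a zero-mean Gaussian prior on one weight."""
    return -0.5 * math.log(2 * math.pi * scale ** 2) - 0.5 * (w / scale) ** 2


def student_t_logpdf(w, df=3.0, scale=1.0):
    """Log-density of a zero-mean Student-t prior (heavy-tailed).

    df and scale are illustrative values, not taken from the paper.
    """
    z = w / scale
    return (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
        - math.log(scale)
        - (df + 1) / 2 * math.log1p(z * z / df)
    )


def spatial_covariance(size=3, lengthscale=1.0):
    """Covariance between the size*size weights of one conv filter.

    Entries decay with the squared pixel distance (a squared-exponential
    kernel, assumed here purely for illustration), so neighbouring
    weights are more strongly correlated than distant ones.
    """
    coords = [(r, c) for r in range(size) for c in range(size)]
    return [
        [
            math.exp(-((r1 - r2) ** 2 + (c1 - c2) ** 2) / (2 * lengthscale ** 2))
            for (r2, c2) in coords
        ]
        for (r1, c1) in coords
    ]


if __name__ == "__main__":
    # A large weight is far less penalised under the heavy-tailed prior.
    for w in (0.0, 2.0, 5.0):
        print(w, gaussian_logpdf(w), student_t_logpdf(w))

    cov = spatial_covariance()
    # Index 0 is pixel (0, 0); index 1 is its neighbour (0, 1);
    # index 8 is the opposite corner (2, 2).
    print(cov[0][1], cov[0][8])
```

An isotropic Gaussian prior corresponds to replacing `spatial_covariance` with the identity matrix, so every weight is independent and large weights are penalised quadratically in log-density. The two alternatives relax exactly those two assumptions.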
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference
Methods: Stochastic Gradient Descent
