How Good is the Bayes Posterior in Deep Neural Networks Really?
Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

TL;DR
This paper critically examines the effectiveness of Bayesian posteriors in deep neural networks, revealing that they often underperform compared to simpler methods and that 'cold posteriors' improve predictions but deviate from Bayesian principles.
Contribution
The study demonstrates that true Bayesian posteriors in deep neural networks are often worse than point estimates and explores the reasons behind the success of cold posteriors, challenging current assumptions.
Findings
Bayesian posteriors can yield worse predictions than point estimates.
Cold posteriors improve predictive performance significantly.
Cold posteriors deviate from the Bayesian paradigm but are heuristically effective.
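To make the "cold posterior" in the findings above concrete: tempering replaces the Bayes posterior with p_T(θ | D) ∝ exp(−U(θ) / T), where U(θ) = −log p(D | θ) − log p(θ) is the posterior energy. T = 1 recovers the Bayes posterior and T < 1 gives a cold, sharper one. The sketch below is only an illustration under stated assumptions: it uses a one-parameter Gaussian toy model (unknown mean, known noise variance) rather than a deep network, and the function names, data, and grid bounds are hypothetical choices, not taken from the paper.

```python
import math


def energy(theta, data, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Posterior energy U(theta) = -log p(D | theta) - log p(theta), up to a constant.

    Toy assumption: Gaussian likelihood with known noise variance and a
    Gaussian prior on the mean theta.
    """
    nll = sum((x - theta) ** 2 for x in data) / (2.0 * noise_var)
    neg_log_prior = (theta - prior_mean) ** 2 / (2.0 * prior_var)
    return nll + neg_log_prior


def tempered_moments_closed_form(data, temperature, prior_mean=0.0,
                                 prior_var=1.0, noise_var=1.0):
    """Mean and variance of exp(-U / T) for the toy model.

    U is quadratic here, so dividing it by T keeps the mean and multiplies
    the variance by T.
    """
    precision = 1.0 / prior_var + len(data) / noise_var
    mean = (prior_mean / prior_var + sum(data) / noise_var) / precision
    return mean, temperature / precision


def tempered_moments_on_grid(data, temperature, lo=-3.0, hi=5.0, steps=8001):
    """Numerical check: normalise exp(-U / T) on a uniform grid."""
    thetas = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    energies = [energy(t, data) for t in thetas]
    u_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(u - u_min) / temperature) for u in energies]
    total = sum(weights)
    mean = sum(w * t for w, t in zip(weights, thetas)) / total
    var = sum(w * (t - mean) ** 2 for w, t in zip(weights, thetas)) / total
    return mean, var


if __name__ == "__main__":
    data = [0.8, 1.2, 1.0, 0.9]  # hypothetical observations
    for T in (1.0, 0.25):  # Bayes posterior vs. a cold posterior
        print(T, tempered_moments_closed_form(data, T),
              tempered_moments_on_grid(data, T))
```

In this toy case cooling only narrows the posterior around the same mean. For deep networks the posterior is not Gaussian, and the effect of T on predictive performance has to be measured empirically, as the paper does with MCMC sampling.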
Abstract
During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are, as of early 2020, no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a "cold posterior" that overcounts…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms
Methods: Stochastic Gradient Descent
