The Case for Bayesian Deep Learning

Andrew Gordon Wilson

arXiv:2001.10995 · cs.LG · January 30, 2020 · 66 cites

TL;DR

Bayesian deep learning leverages marginalization over parameters to improve model calibration and accuracy, with recent advances enhancing scalability and practical performance.

Contribution

This paper clarifies the core principles of Bayesian deep learning, highlighting the role of marginalization and structured priors, and connects deep ensembles to Bayesian methods.

Findings

01. Deep ensembles approximate Bayesian marginalization.

02. Bayesian methods improve calibration and accuracy.

03. Structured priors in neural networks aid generalization.
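
To make the first finding concrete, the contrast between marginalization and optimization can be written out as below. The notation is an assumption of this sketch and does not appear on this page: w denotes network parameters, D the training data, and w_1, ..., w_M a finite set of parameter settings, for example the members of a deep ensemble weighted equally.

```latex
% Optimization: predict with a single parameter setting \hat{w}
\hat{w} = \arg\max_{w} \, p(w \mid \mathcal{D}),
\qquad
p(y \mid x, \mathcal{D}) \approx p(y \mid x, \hat{w})

% Marginalization: average the predictions of all parameter settings,
% weighted by their posterior probability
p(y \mid x, \mathcal{D}) = \int p(y \mid x, w) \, p(w \mid \mathcal{D}) \, dw

% Finite approximation with M parameter settings and equal weights
p(y \mid x, \mathcal{D}) \approx \frac{1}{M} \sum_{m=1}^{M} p(y \mid x, w_m)
```

Under this reading, an ensemble of independently trained networks supplies the settings w_m, and the quality of the approximation depends on how well those settings cover the high-probability regions of the posterior.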

Abstract

The key distinguishing property of a Bayesian approach is marginalization instead of optimization, not the prior, or Bayes rule. Bayesian inference is especially compelling for deep neural networks. (1) Neural networks are typically underspecified by the data, and can represent many different but high performing models corresponding to different settings of parameters, which is exactly when marginalization will make the biggest difference for both calibration and accuracy. (2) Deep ensembles have been mistaken as competing approaches to Bayesian methods, but can be seen as approximate Bayesian marginalization. (3) The structure of neural networks gives rise to a structured prior in function space, which reflects the inductive biases of neural networks that help them generalize. (4) The observed correlation between parameters in flat regions of the loss and a diversity of solutions that…
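
The idea of averaging predictions over several parameter settings, rather than committing to one, can be illustrated with a small NumPy sketch. Everything here is a toy assumption and not the paper's experimental setup: the "members" are random noisy logits standing in for trained networks, the helper names (`softmax`, `model_average`, `nll`) are hypothetical, and equal weighting is used as a simple stand-in for posterior weighting.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last axis, with max-subtraction for stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def model_average(member_probs):
    """Equal-weight average of predictive distributions.

    member_probs has shape (M, N, C): M members, N inputs, C classes.
    """
    return member_probs.mean(axis=0)

def nll(probs, labels):
    """Mean negative log-likelihood of integer labels under probs (N, C)."""
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

rng = np.random.default_rng(0)
M, N, C = 5, 200, 3  # hypothetical sizes: members, inputs, classes
labels = rng.integers(0, C, size=N)

# Stand-in for trained networks: logits that favour the true class,
# perturbed independently for each member.
signal = 2.0 * np.eye(C)[labels]                       # shape (N, C)
logits = signal[None] + rng.normal(scale=2.0, size=(M, N, C))
member_probs = softmax(logits)                         # shape (M, N, C)

avg_probs = model_average(member_probs)                # shape (N, C)
member_nlls = np.array([nll(p, labels) for p in member_probs])
avg_nll = nll(avg_probs, labels)

print("mean NLL of individual members:", member_nlls.mean())
print("NLL of the averaged prediction:", avg_nll)
```

By Jensen's inequality, the negative log-likelihood of the averaged prediction is never worse than the mean negative log-likelihood of the individual members. Averaging tempers the overconfident errors of any single parameter setting, which is one informal way to see why marginalization can help calibration.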


Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Anomaly Detection Techniques and Applications

Methods: Deep Ensembles