Bayesian Deep Learning and a Probabilistic Perspective of Generalization

Andrew Gordon Wilson; Pavel Izmailov

arXiv:2002.08791 · cs.LG · March 31, 2022 · 182 cites

TL;DR

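This paper argues that the defining feature of Bayesian deep learning is marginalization over weights rather than reliance on a single weight setting. It shows that deep ensembles act as an effective approximation to Bayesian marginalization, proposes a related approach that also marginalizes within basins of attraction, and explains neural network generalization from a probabilistic standpoint.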

Contribution

The paper shows that Bayesian marginalization can improve the accuracy and calibration of deep neural networks, and it offers a probabilistic explanation for generalization phenomena observed in these models.
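For reference, the marginalization described here is usually written as a Bayesian model average. The notation below is an assumption, since this page gives none: x is a test input, y its label, w the network weights, D the training data, and w_1, ..., w_M are M weight settings (for a deep ensemble, the independently trained members). The second line is the simple Monte Carlo style approximation that an equally weighted ensemble corresponds to, offered as a sketch rather than the paper's exact formulation.

```latex
\begin{align}
p(y \mid x, \mathcal{D}) &= \int p(y \mid x, w)\, p(w \mid \mathcal{D})\, dw \\
&\approx \frac{1}{M} \sum_{m=1}^{M} p(y \mid x, w_m)
\end{align}
```

A single trained network corresponds to replacing the posterior p(w | D) with a point mass at one weight setting, which is what marginalization is contrasted with.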

Findings

01  Deep ensembles effectively approximate Bayesian marginalization.

02  Bayesian model averaging alleviates double descent.

03  Gaussian processes replicate neural network generalization results.

Abstract

The key distinguishing property of a Bayesian approach is marginalization, rather than using a single setting of weights. Bayesian marginalization can particularly improve the accuracy and calibration of modern deep neural networks, which are typically underspecified by the data, and can represent many compelling but different solutions. We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization, and propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction, without significant overhead. We also investigate the prior over functions implied by a vague distribution over neural network weights, explaining the generalization properties of such models from a probabilistic perspective. From this perspective, we explain results that have been presented as mysterious and distinct to neural…
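To make the contrast between marginalization and a single weight setting concrete, here is a minimal NumPy sketch of ensemble-style averaging. It is not taken from the paper's repository: the function names `softmax` and `ensemble_predictive` and the toy logits are hypothetical, and the equal weighting of members is an assumption. The point it illustrates is that the average is taken over predictive probabilities, not over weights or logits.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def ensemble_predictive(member_logits):
    """Average class probabilities over ensemble members.

    member_logits: array of shape (M, N, C) holding the logits of
    M members for N inputs and C classes (hypothetical layout).
    Returns an (N, C) array of averaged probabilities.
    """
    probs = softmax(np.asarray(member_logits, dtype=float), axis=-1)
    return probs.mean(axis=0)  # equal weight per member (assumption)


# Toy case: two members that are each confident but disagree on one input.
member_logits = np.array([
    [[4.0, 0.0]],  # member 1 favours class 0
    [[0.0, 4.0]],  # member 2 favours class 1
])

single = softmax(member_logits[0])             # one weight setting only
averaged = ensemble_predictive(member_logits)  # marginalized prediction

print("single member:", single)
print("ensemble average:", averaged)
```

In this toy case each member alone is highly confident, while the averaged prediction spreads its mass across both classes, reflecting the disagreement between the two solutions. This is the kind of effect on calibration that the abstract attributes to marginalizing over different solutions.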

Code & Models

Repositories

izmailovpavel/understandingbdl (PyTorch, official)

Videos

Bayesian Deep Learning and a Probabilistic Perspective of Generalization · SlidesLive

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning

Methods: Deep Ensembles