Subspace Inference for Bayesian Deep Learning
Pavel Izmailov, Wesley J. Maddox, Polina Kirichenko, Timur Garipov, Dmitry Vetrov, Andrew Gordon Wilson

TL;DR
This paper introduces a subspace inference approach for Bayesian deep learning, enabling scalable Bayesian inference in neural networks by focusing on low-dimensional parameter subspaces derived from SGD trajectories.
Contribution
It proposes constructing low-dimensional subspaces of neural network parameters, such as principal components of SGD trajectories, to facilitate Bayesian inference methods like elliptical slice sampling and variational inference.
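A minimal sketch of how a PCA subspace of an SGD trajectory could be built, assuming weight snapshots are available as flattened vectors. The function names `pca_subspace` and `to_weights`, the synthetic trajectory, and the choice to scale each direction by its standard deviation are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np

def pca_subspace(trajectory, rank):
    """Return (mean, P) describing the affine subspace w = mean + P @ z.

    trajectory: array of shape (num_snapshots, num_params), one flattened
    weight vector per row (hypothetical input, e.g. saved once per epoch).
    """
    trajectory = np.asarray(trajectory, dtype=float)
    mean = trajectory.mean(axis=0)
    deviations = trajectory - mean
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(deviations, full_matrices=False)
    # Assumption of this sketch: scale each direction by the standard
    # deviation of the trajectory along it, so z ~ N(0, I) is a natural prior.
    scale = singular_values[:rank] / np.sqrt(max(len(trajectory) - 1, 1))
    P = (vt[:rank] * scale[:, None]).T  # shape (num_params, rank)
    return mean, P

def to_weights(mean, P, z):
    """Map low-dimensional coordinates z back to a full weight vector."""
    return mean + P @ z

# Synthetic stand-in for an SGD trajectory: 20 snapshots of 50 parameters
# that lie close to a 2-dimensional plane.
rng = np.random.default_rng(0)
directions = rng.standard_normal((2, 50))
coefficients = rng.standard_normal((20, 2))
trajectory = 1.0 + coefficients @ directions + 0.01 * rng.standard_normal((20, 50))

mean, P = pca_subspace(trajectory, rank=2)
w = to_weights(mean, P, np.array([0.5, -1.0]))
```

Any inference method then only has to explore the `rank`-dimensional vector `z`, while `to_weights` recovers the full parameter vector needed to evaluate the network.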
Findings
Bayesian model averaging in subspaces yields accurate predictions.
The method produces well-calibrated uncertainty estimates.
Applicable to both regression and image classification tasks.
Abstract
Bayesian inference was once a gold standard for learning with neural networks, providing accurate full predictive distributions and well calibrated uncertainty. However, scaling Bayesian inference techniques to deep neural networks is challenging due to the high dimensionality of the parameter space. In this paper, we construct low-dimensional subspaces of parameter space, such as the first principal components of the stochastic gradient descent (SGD) trajectory, which contain diverse sets of high performing models. In these subspaces, we are able to apply elliptical slice sampling and variational inference, which struggle in the full parameter space. We show that Bayesian model averaging over the induced posterior in these subspaces produces accurate predictions and well calibrated predictive uncertainty for both regression and image classification.
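The pipeline described in the abstract (sample subspace coordinates, then average predictions over the sampled models) can be sketched on a toy problem. This is an illustration under stated assumptions, not the authors' implementation: the model is a linear regression rather than a neural network, the subspace is a placeholder (a least-squares centre `w_hat` with a random orthonormal basis `P`), the prior on `z` is a standard normal, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical): y = x @ w_true + noise.
num_params, rank, noise_std = 10, 2, 0.5
x = rng.standard_normal((40, num_params))
w_true = rng.standard_normal(num_params)
y = x @ w_true + noise_std * rng.standard_normal(40)

# Placeholder subspace: least-squares centre plus a random orthonormal basis.
w_hat = np.linalg.lstsq(x, y, rcond=None)[0]
P, _ = np.linalg.qr(rng.standard_normal((num_params, rank)))

def log_likelihood(z):
    """Gaussian log-likelihood (up to a constant) of weights w_hat + P @ z."""
    residual = y - x @ (w_hat + P @ z)
    return -0.5 * np.sum(residual ** 2) / noise_std ** 2

def elliptical_slice_step(z, log_lik, prior_std, rng):
    """One elliptical slice sampling update under a N(0, prior_std^2 I) prior."""
    nu = prior_std * rng.standard_normal(z.shape)        # auxiliary prior draw
    log_y = log_lik(z) + np.log(1.0 - rng.uniform())     # slice height
    theta = rng.uniform(0.0, 2.0 * np.pi)
    theta_min, theta_max = theta - 2.0 * np.pi, theta
    while True:
        proposal = z * np.cos(theta) + nu * np.sin(theta)
        if log_lik(proposal) > log_y:
            return proposal
        # Shrink the angle bracket towards the current state (theta = 0).
        if theta < 0.0:
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)

# Run the sampler in the 2-dimensional subspace, discarding 100 burn-in steps.
z = np.zeros(rank)
samples = []
for step in range(300):
    z = elliptical_slice_step(z, log_likelihood, 1.0, rng)
    if step >= 100:
        samples.append(z)
samples = np.array(samples)

# Bayesian model average: average predictions over the sampled weights.
x_test = rng.standard_normal((5, num_params))
predictions = np.stack([x_test @ (w_hat + P @ s) for s in samples])
bma_mean = predictions.mean(axis=0)
bma_std = predictions.std(axis=0)  # spread across sampled models only
```

Elliptical slice sampling has no step size to tune, but each update may evaluate the likelihood several times. Here `bma_std` captures only the disagreement between sampled models; a full predictive distribution would also include the observation noise.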
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis
