Practical Deep Learning with Bayesian Principles
Kazuki Osawa, Siddharth Swaroop, Anirudh Jain, Runa Eschenhagen, Richard E. Turner, Rio Yokota, Mohammad Emtiyaz Khan

TL;DR
This paper presents a practical approach to deep learning using Bayesian principles through natural-gradient variational inference, achieving competitive performance and improved uncertainty estimation on large datasets like ImageNet.
Contribution
It introduces a practical method for training deep networks with natural-gradient variational inference, made workable at scale by techniques such as batch normalization, data augmentation, and distributed training.
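To make the idea of natural-gradient variational inference concrete, here is a minimal sketch on a toy problem: one-weight linear regression with a Gaussian prior, where the required expectations have closed forms and the exact posterior is known. This is not the paper's optimiser, which targets deep networks where such expectations are not available in closed form. The function names, step sizes, and toy data are all assumptions made for illustration.

```python
def natural_gradient_vi(xs, ys, prior_precision=1.0, lr=0.1, beta=0.1, steps=500):
    """Fit q(w) = N(mu, sigma2) for the loss 0.5 * (w * x - y) ** 2 per example.

    Assumes unit noise variance and a prior w ~ N(0, 1 / prior_precision).
    All names and default values here are hypothetical.
    """
    n = len(xs)
    a = sum(x * x for x in xs) / n                 # average curvature (constant for a quadratic loss)
    b = sum(x * y for x, y in zip(xs, ys)) / n
    delta = prior_precision / n                    # prior precision, scaled per example
    mu, s = 0.0, 1.0                               # mean and curvature estimate of q
    for _ in range(steps):
        grad = a * mu - b                          # E_q[gradient]; exact since the gradient is linear in w
        s = (1.0 - beta) * s + beta * a            # moving average of the curvature
        mu = mu - lr * (grad + delta * mu) / (s + delta)  # curvature-preconditioned mean step
    sigma2 = 1.0 / (n * (s + delta))               # variance implied by the curvature estimate
    return mu, sigma2


def exact_posterior(xs, ys, prior_precision=1.0):
    """Closed-form Gaussian posterior for the same toy model."""
    precision = prior_precision + sum(x * x for x in xs)
    mean = sum(x * y for x, y in zip(xs, ys)) / precision
    return mean, 1.0 / precision


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0], [2.1, 3.9, 6.2]
    print(natural_gradient_vi(xs, ys))
    print(exact_posterior(xs, ys))
```

On this conjugate toy model the iterates approach the exact posterior mean and variance, which is a convenient sanity check. The update has the same shape as an adaptive-gradient step (a gradient divided by a running curvature estimate), which is one way to read the comparison with Adam.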
Findings
Achieves performance comparable to the Adam optimizer on ImageNet, in about the same number of epochs
Predictive probabilities are well-calibrated and uncertainties on out-of-distribution data are improved
Continual-learning performance is boosted
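Calibration claims like the one above are commonly quantified with the expected calibration error (ECE), which compares confidence with accuracy inside confidence bins. The sketch below is one standard formulation in plain Python. Whether the paper reports this exact metric is not stated in this summary, and the function name and bin count are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between accuracy and mean confidence per bin.

    confidences: top predicted probability per example, each in [0, 1].
    correct: booleans, True where the top prediction was right.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(confidences)
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)   # confidence 1.0 goes in the last bin
        conf_sum[idx] += c
        hit_sum[idx] += 1.0 if ok else 0.0
        count[idx] += 1
    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue                              # empty bins contribute nothing
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_conf)
    return ece


if __name__ == "__main__":
    # Fully confident but right only half the time: a large calibration gap.
    print(expected_calibration_error([1.0] * 4, [True, True, False, False]))
```

A model that says 0.8 and is right 80% of the time scores close to zero, while a model that is always fully confident but right half the time scores 0.5.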
Abstract
Bayesian methods promise to fix many shortcomings of deep learning, but they are impractical and rarely match the performance of standard methods, let alone improve them. In this paper, we demonstrate practical training of deep networks with natural-gradient variational inference. By applying techniques such as batch normalisation, data augmentation, and distributed training, we achieve similar performance in about the same number of epochs as the Adam optimiser, even on large datasets such as ImageNet. Importantly, the benefits of Bayesian principles are preserved: predictive probabilities are well-calibrated, uncertainties on out-of-distribution data are improved, and continual-learning performance is boosted. This work enables practical deep learning while preserving benefits of Bayesian principles. A PyTorch implementation is available as a plug-and-play optimiser.
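One common way to turn a distribution over weights into predictive probabilities is to average the class probabilities produced by several sampled weight settings. The sketch below shows only that averaging step, taking the per-sample logits as given. That predictions are formed this way is an assumption about the setup rather than something this summary states, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def averaged_predictive(logits_per_sample):
    """Average class probabilities over logits from several weight samples."""
    if not logits_per_sample:
        raise ValueError("need at least one set of logits")
    n_classes = len(logits_per_sample[0])
    probs = [0.0] * n_classes
    for logits in logits_per_sample:
        for k, p in enumerate(softmax(logits)):
            probs[k] += p
    return [p / len(logits_per_sample) for p in probs]


if __name__ == "__main__":
    # Two weight samples that disagree about the class.
    samples = [[0.0, math.log(3.0)], [math.log(3.0), 0.0]]
    print(averaged_predictive(samples))
```

When the sampled networks disagree, as in the usage example, each is individually confident but the averaged prediction is close to uniform. That disagreement is the mechanism by which weight uncertainty can temper confidence on unfamiliar inputs.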
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Gaussian Processes and Bayesian Inference · Advanced Neural Network Applications
Methods: Adam
