Fast and Scalable Bayesian Deep Learning by Weight-Perturbation in Adam

Mohammad Emtiyaz Khan; Didrik Nielsen; Voot Tangkaratt; Wu Lin; Yarin Gal; Akash Srivastava

arXiv:1806.04854 · stat.ML · August 3, 2018 · 60 citations

TL;DR

This paper introduces a new natural-gradient algorithm for Bayesian deep learning that integrates weight perturbations into Adam, enabling efficient uncertainty estimation with less computational effort.

Contribution

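The authors develop natural-gradient algorithms for Gaussian mean-field variational inference that can be implemented inside Adam by perturbing the network weights during gradient evaluations. Uncertainty estimates are read off the vector Adam already uses to adapt the learning rate, which lowers memory, computation, and implementation effort relative to existing VI methods.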
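For context, Gaussian mean-field VI is usually stated as fitting a factorized Gaussian to the posterior by maximizing the evidence lower bound. The notation below (weights \theta, data \mathcal{D}, prior p(\theta), variational mean \mu and variance \sigma^2) is assumed for illustration and is not taken from this summary:

```latex
q(\theta) = \mathcal{N}\!\left(\theta \mid \mu, \operatorname{diag}(\sigma^2)\right), \qquad
\mathcal{L}(\mu, \sigma^2) = \mathbb{E}_{q(\theta)}\!\left[\log p(\mathcal{D} \mid \theta)\right]
  - \mathrm{KL}\!\left[\, q(\theta) \,\|\, p(\theta) \,\right]
```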

Findings

01. Achieves uncertainty estimates of comparable quality to existing VI methods.

02. Requires less memory, computation, and implementation effort than existing VI methods.

03. The weight perturbation may also be useful for exploration in reinforcement learning and for stochastic optimization.

Abstract

Uncertainty computation in deep learning is essential to design robust and reliable systems. Variational inference (VI) is a promising approach for such computation, but requires more effort to implement and execute compared to maximum-likelihood methods. In this paper, we propose new natural-gradient algorithms to reduce such efforts for Gaussian mean-field VI. Our algorithms can be implemented within the Adam optimizer by perturbing the network weights during gradient evaluations, and uncertainty estimates can be cheaply obtained by using the vector that adapts the learning rate. This requires lower memory, computation, and implementation effort than existing VI methods, while obtaining uncertainty estimates of comparable quality. Our empirical results confirm this and further suggest that the weight-perturbation in our algorithm could be useful for exploration in reinforcement…
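The idea in the abstract can be sketched on a toy problem: perturb the weights before each gradient evaluation inside an Adam-style loop, then read a standard deviation off the second-moment vector. This is a minimal sketch, not the authors' implementation. The exact update, the assumed link `sigma = 1 / sqrt(n_data * s + prior_prec)`, the hyperparameter values, and the names `perturbed_adam_step` and `fit_toy_regression` are all assumptions made for illustration.

```python
import numpy as np


def perturbed_adam_step(mu, m, s, t, grad_fn, n_data, rng,
                        prior_prec=1.0, lr=0.01, beta1=0.9, beta2=0.999):
    """One Adam-style step with Gaussian weight perturbation (illustrative)."""
    # Assumed link between Adam's scale vector s and the posterior std.
    sigma = 1.0 / np.sqrt(n_data * s + prior_prec)
    # Evaluate the gradient at perturbed weights instead of at the mean mu.
    theta = mu + sigma * rng.standard_normal(mu.shape)
    g = grad_fn(theta)
    # Moment updates as in Adam; the Gaussian prior adds a weight-decay-like term.
    m = beta1 * m + (1.0 - beta1) * (g + prior_prec * mu / n_data)
    s = beta2 * s + (1.0 - beta2) * g * g
    # Bias correction as in Adam.
    m_hat = m / (1.0 - beta1 ** t)
    s_hat = s / (1.0 - beta2 ** t)
    mu = mu - lr * m_hat / (np.sqrt(s_hat) + prior_prec / n_data)
    return mu, m, s


def fit_toy_regression(seed=0, n_data=200, steps=3000, prior_prec=1.0):
    """Fit a 3-weight linear regression; return the mean and std estimates."""
    rng = np.random.default_rng(seed)
    w_true = np.array([1.0, -2.0, 0.5])  # hypothetical ground-truth weights
    X = rng.standard_normal((n_data, 3))
    y = X @ w_true + 0.1 * rng.standard_normal(n_data)

    def grad_fn(theta):
        # Data-averaged gradient of the squared loss 0.5 * (x @ theta - y)**2.
        return X.T @ (X @ theta - y) / n_data

    mu, m, s = np.zeros(3), np.zeros(3), np.zeros(3)
    for t in range(1, steps + 1):
        mu, m, s = perturbed_adam_step(mu, m, s, t, grad_fn, n_data, rng,
                                       prior_prec=prior_prec)
    # The uncertainty estimate reuses the vector that scales the step size.
    sigma = 1.0 / np.sqrt(n_data * s + prior_prec)
    return mu, sigma


if __name__ == "__main__":
    mu, sigma = fit_toy_regression()
    print("mean estimate:", mu)
    print("std estimate: ", sigma)
```

Compared with plain Adam, the only extra state is the random draw per step, which is in line with the low memory and implementation cost the abstract describes. The standard deviations from this toy run are rough, since squared gradients stand in for curvature.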

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms

Methods: Adam