Bayesian Layers: A Module for Neural Network Uncertainty

Dustin Tran, Michael W. Dusenberry, Mark van der Wilk, Danijar Hafner

arXiv:1812.03973·cs.LG·March 7, 2019·49 cites

TL;DR

Bayesian Layers provides a flexible module for integrating uncertainty into neural networks, supporting various stochastic components and enabling scalable experimentation with models like Bayesian Transformers and deep GPs.

Contribution

The paper introduces a unified, extensible module for neural network uncertainty that is compatible with existing neural network libraries and scales to large models and complex probabilistic architectures.

Findings

1. Fit a 5-billion-parameter Bayesian Transformer on 512 TPUv2 cores for uncertainty in machine translation
2. Provided code examples of Bayesian Layers in architectures such as Bayesian LSTMs, deep GPs, and flow-based models
3. Integrated Bayesian Layers with Edward2 for probabilistic programming

Abstract

We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with drop-in replacements for common layers. This enables composition via a unified abstraction over deterministic and stochastic functions and allows for scalability via the underlying system. These layers capture uncertainty over weights (Bayesian neural nets), pre-activation units (dropout), activations ("stochastic output layers"), or the function itself (Gaussian processes). They can also be reversible to propagate uncertainty from input to output. We include code examples for common architectures such as Bayesian LSTMs, deep GPs, and flow-based models. As a demonstration, we fit a 5-billion-parameter "Bayesian Transformer" on 512 TPUv2 cores for uncertainty in machine translation and a Bayesian dynamics model for model-based planning. Finally,…
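The "drop-in replacement" idea for weight uncertainty can be illustrated with a small sketch. The code below is not the Edward2 API; it is a minimal pure-Python illustration, and the class names `DenseDeterministic` and `DenseReparameterized` as well as the helper `predictive_mean_and_std` are hypothetical. It assumes a factorized Gaussian distribution over each weight, with the standard deviation parameterized as softplus(rho), and draws a fresh weight sample on every call so that repeated forward passes give a spread of outputs.

```python
import math
import random


class DenseDeterministic:
    """Ordinary dense layer: y = W x + b with fixed weights."""

    def __init__(self, weights, bias):
        self.weights = weights  # list of rows, shape (out_dim, in_dim)
        self.bias = bias        # list of length out_dim

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weights, self.bias)]


class DenseReparameterized:
    """Hypothetical stochastic stand-in with the same call signature.

    Each weight is drawn as mean + softplus(rho) * eps with eps ~ N(0, 1),
    so every call uses a new weight sample (an assumption of this sketch).
    """

    def __init__(self, mean, rho, bias, rng=None):
        self.mean = mean
        self.rho = rho
        self.bias = bias
        self.rng = rng or random.Random()

    def __call__(self, x):
        out = []
        for mean_row, rho_row, b in zip(self.mean, self.rho, self.bias):
            acc = b
            for m, r, xi in zip(mean_row, rho_row, x):
                sigma = math.log1p(math.exp(r))  # softplus keeps sigma > 0
                eps = self.rng.gauss(0.0, 1.0)
                acc += (m + sigma * eps) * xi
            out.append(acc)
        return out


def predictive_mean_and_std(layer, x, num_samples=1000):
    """Monte Carlo estimate of the per-output mean and standard deviation."""
    samples = [layer(x) for _ in range(num_samples)]
    means, stds = [], []
    for j in range(len(samples[0])):
        column = [s[j] for s in samples]
        mu = sum(column) / num_samples
        var = sum((v - mu) ** 2 for v in column) / num_samples
        means.append(mu)
        stds.append(math.sqrt(var))
    return means, stds


# Both layers are called the same way; only the stochastic one varies per call.
x = [2.0, 1.0]
layers = [
    DenseDeterministic([[1.0, -2.0]], [0.5]),
    DenseReparameterized([[1.0, -2.0]], [[0.0, 0.0]], [0.5],
                         rng=random.Random(0)),
]
for layer in layers:
    means, stds = predictive_mean_and_std(layer, x)
    print(type(layer).__name__, means, stds)
```

Because both classes expose the same `__call__` interface, the surrounding model code does not change when one is swapped for the other; the spread across repeated calls is what serves as the uncertainty estimate in this sketch.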

Code & Models

Repositories

google/edward2 (TensorFlow)

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Anomaly Detection Techniques and Applications · Model Reduction and Neural Networks