The Relevance of Bayesian Layer Positioning to Model Uncertainty in Deep Bayesian Active Learning

Jiaming Zeng; Adam Lesnikowski; Jose M. Alvarez

arXiv:1811.12535·cs.LG·October 20, 2020·24 cites

TL;DR

This paper investigates how the placement and number of Bayesian layers in neural networks affect the ability to model uncertainty, aiming to optimize active learning performance with fewer Bayesian components.

Contribution

It demonstrates that placing a few Bayesian layers near the output can effectively capture uncertainty, reducing computational costs compared to fully Bayesian networks.

Findings

1. Few Bayesian layers near output suffice for uncertainty estimation
2. Optimal layer placement improves active learning efficiency
3. Partial Bayesian networks outperform deterministic models in uncertainty modeling

Abstract

One of the main challenges of deep learning tools is their inability to capture model uncertainty. While Bayesian deep learning can be used to tackle the problem, Bayesian neural networks often require more time and computational power to train than deterministic networks. Our work explores whether fully Bayesian networks are needed to successfully capture model uncertainty. We vary the number and position of Bayesian layers in a network and compare their performance on active learning with the MNIST dataset. We found that we can fully capture the model uncertainty by using only a few Bayesian layers near the output of the network, combining the advantages of deterministic and Bayesian networks.
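
The setup described in the abstract can be illustrated with a minimal sketch. The abstract does not say which approximate inference method or acquisition function the authors use, so this sketch assumes Monte Carlo dropout as the stand-in for a "Bayesian layer" and maximum predictive entropy as the acquisition rule. All function names, the `bayesian` mask, layer sizes, and the random weights are hypothetical and serve only to show how the number and position of stochastic layers can be varied while the rest of the network stays deterministic.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights):
    # weights: one row of input weights per output unit (no bias, for brevity)
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def forward(x, layers, bayesian, rng, p_drop=0.5):
    """One forward pass. Layers flagged in `bayesian` get dropout on their
    input even at prediction time (assumed MC dropout); others are deterministic."""
    h = x
    for i, (w, is_bayes) in enumerate(zip(layers, bayesian)):
        if is_bayes:
            h = [0.0 if rng.random() < p_drop else v / (1.0 - p_drop) for v in h]
        h = dense(h, w)
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]  # ReLU on hidden layers only
    return softmax(h)

def predictive_entropy(x, layers, bayesian, rng, n_samples=20):
    # Entropy of the class distribution averaged over stochastic passes
    samples = [forward(x, layers, bayesian, rng) for _ in range(n_samples)]
    n_classes = len(samples[0])
    mean = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    return -sum(p * math.log(p) for p in mean if p > 0.0)

def select_queries(pool, layers, bayesian, rng, k=2):
    # Active learning step: pick the k pool points with the highest entropy
    scores = [predictive_entropy(x, layers, bayesian, rng) for x in pool]
    ranked = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

def random_layers(sizes, rng):
    # Hypothetical untrained weights, just so the sketch runs end to end
    return [[[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

rng = random.Random(0)
layers = random_layers([4, 8, 8, 3], rng)
pool = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(10)]

# Only the layer nearest the output is stochastic; earlier layers are deterministic.
picked = select_queries(pool, layers, [False, False, True], rng, k=2)
print("indices to label next:", picked)
```

Changing the mask (for example `[True, True, True]` for a fully stochastic network, or `[True, False, False]` for a stochastic layer near the input) is the kind of comparison the abstract describes. With an all-`False` mask every pass is identical, so the entropy reflects only the softmax output and not model uncertainty.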

Taxonomy

Topics: Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning