Explicitly Bayesian Regularizations in Deep Learning

Xinjie Lan; Kenneth E. Barner

arXiv:1910.09732·cs.LG·October 23, 2019·1 cites

Explicitly Bayesian Regularizations in Deep Learning

Xinjie Lan, Kenneth E. Barner

PDF

Open Access

TL;DR

This paper demonstrates explicit Bayesian regularizations in CNNs by representing hidden layers probabilistically, clarifying empirical phenomena, and validating the theory on synthetic data.

Contribution

It introduces a novel probabilistic representation for CNN hidden layers, establishing their Bayesian network structure and explicit regularization.

Findings

01

CNNs correspond to Bayesian networks with serial connections

02

Hidden layers near input form prior distributions

03

Theory validated on synthetic dataset

Abstract

Generalization is essential for deep learning. In contrast to previous works claiming that Deep Neural Networks (DNNs) have an implicit regularization implemented by the stochastic gradient descent, we demonstrate explicitly Bayesian regularizations in a specific category of DNNs, i.e., Convolutional Neural Networks (CNNs). First, we introduce a novel probabilistic representation for the hidden layers of CNNs and demonstrate that CNNs correspond to Bayesian networks with the serial connection. Furthermore, we show that the hidden layers close to the input formulate prior distributions, thus CNNs have explicitly Bayesian regularizations based on the Bayesian regularization theory. In addition, we clarify two recently observed empirical phenomena that are inconsistent with traditional theories of generalization. Finally, we validate the proposed theory on a synthetic dataset

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsGaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Machine Learning and Algorithms