Kernels and Submodels of Deep Belief Networks
Guido F. Montufar, Jason Morton

TL;DR
This paper analyzes the mathematical structure of Deep Belief Networks (DBNs), using the kernels and submodels that DBNs realize to characterize their representational power and to bound their approximation errors.
Contribution
The paper provides a unified, kernel-based framework for DBNs: it describes the combinatorial and geometric properties of the kernels they realize and uses explicit submodels to bound their approximation errors.
Findings
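Characterization of the kernels, and products of kernels, realizable by DBNs
Explicit classes of distributions, including exponential families, that DBNs can learn
Upper bounds on the maximal and expected Kullback-Leibler approximation errors, depending on the number of hidden layers and units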
Abstract
We study the mixtures of factorizing probability distributions represented as visible marginal distributions in stochastic layered networks. We take the perspective of kernel transitions of distributions, which gives a unified picture of distributed representations arising from Deep Belief Networks (DBN) and other networks without lateral connections. We describe combinatorial and geometric properties of the set of kernels and products of kernels realizable by DBNs as the network parameters vary. We describe explicit classes of probability distributions, including exponential families, that can be learned by DBNs. We use these submodels to bound the maximal and the expected Kullback-Leibler approximation errors of DBNs from above depending on the number of hidden layers and units that they contain.
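The kernel perspective in the abstract can be illustrated with a small numerical sketch. This is not code from the paper. It assumes binary units, directed sigmoid layers, and tiny hypothetical layer sizes. It also uses an arbitrary distribution over the top layer as a stand-in, whereas in an actual DBN the top two layers form a restricted Boltzmann machine. Each layer is written as a stochastic matrix whose rows are factorizing distributions, the layers compose by matrix multiplication, and the visible marginal is the resulting mixture. The names `layer_kernel`, `binary_states`, and `kl_divergence` are illustrative helpers, not part of any library.

```python
import itertools
import numpy as np


def binary_states(n):
    """Return all 2**n binary vectors of length n as rows of a float array."""
    return np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def layer_kernel(W, b):
    """Kernel of one directed sigmoid layer as a stochastic matrix.

    W has shape (n_out, n_in) and b has shape (n_out,). Row h of the
    result is the factorizing distribution p(v | h) over all 2**n_out
    output states, so every row sums to one.
    """
    n_out, n_in = W.shape
    H = binary_states(n_in)           # (2**n_in, n_in) input states
    V = binary_states(n_out)          # (2**n_out, n_out) output states
    p_on = sigmoid(H @ W.T + b)       # p(v_i = 1 | h), shape (2**n_in, n_out)
    # Pick p_on where v_i = 1 and 1 - p_on where v_i = 0, then multiply over units.
    factors = np.where(V[None, :, :] == 1.0,
                       p_on[:, None, :],
                       1.0 - p_on[:, None, :])
    return factors.prod(axis=2)       # (2**n_in, 2**n_out)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for discrete distributions."""
    support = p > 0
    return float(np.sum(p[support] * np.log(p[support] / q[support])))


rng = np.random.default_rng(0)
n_top, n_mid, n_vis = 2, 3, 3         # hypothetical layer sizes

# Assumption: an arbitrary distribution over top-layer states (not an RBM marginal).
top = rng.dirichlet(np.ones(2 ** n_top))

K_top_mid = layer_kernel(rng.normal(size=(n_mid, n_top)), rng.normal(size=n_mid))
K_mid_vis = layer_kernel(rng.normal(size=(n_vis, n_mid)), rng.normal(size=n_vis))

# Composing kernels is a matrix product; the visible marginal is a mixture
# of the rows of the composed kernel, weighted by the top distribution.
K_total = K_top_mid @ K_mid_vis
visible = top @ K_total

# Approximation error against an arbitrary (randomly drawn) target distribution.
target = rng.dirichlet(np.ones(2 ** n_vis))
print("visible marginal sums to", visible.sum())
print("KL(target || visible) =", kl_divergence(target, visible))
```

The explicit state enumeration grows as 2**n, so this form is only practical for very small layers. It is meant to make the mixture structure visible rather than to serve as a training procedure.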
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Neural Networks and Applications
