Hierarchical Models as Marginals of Hierarchical Models

Guido Montufar; Johannes Rauh

arXiv:1508.03606 · math.PR · March 8, 2016

TL;DR

This paper shows how hierarchical models of binary variables can be represented as marginals of hierarchical models with smaller interactions, and uses this to bound the number of hidden units a restricted Boltzmann machine needs to approximate any distribution of binary variables arbitrarily well.

Contribution

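It represents hierarchical models as marginals of pairwise interaction models whose hidden variables are conditionally independent given the visible variables, generalizes previous results, and improves the known bound on the number of hidden units a restricted Boltzmann machine needs for universal approximation.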

Findings

01

Every hidden variable can model multiple interactions among visible variables.

02

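A restricted Boltzmann machine with fewer than $[2(\log(v) + 1)/(v + 1)]\, 2^{v} - 1$ hidden binary variables can approximate any distribution of $v$ visible binary variables arbitrarily well, compared with $2^{v - 1} - 1$ from the best previously known result.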

03

The representation links hierarchical models to neural networks with soft-plus units.
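The link to soft-plus units in the third finding can be checked numerically. The sketch below assumes the standard restricted Boltzmann machine parametrization $p(x, h) \propto \exp(b^\top x + c^\top h + h^\top W x)$ with binary $x$ and $h$; the names `W`, `b`, `c` and all function names are illustrative choices, not notation taken from the paper. Summing out the hidden units gives an unnormalized log-marginal equal to a linear term plus one soft-plus unit per hidden variable, and the code compares that closed form against a brute-force sum over hidden states.

```python
import itertools
import math
import random


def softplus(t):
    # Numerically stable log(1 + exp(t)).
    return max(t, 0.0) + math.log1p(math.exp(-abs(t)))


def log_marginal_bruteforce(x, W, b, c):
    # log sum_h exp(b.x + c.h + h^T W x), summing over all binary hidden states h.
    n_hidden = len(c)
    exponents = []
    for h in itertools.product((0, 1), repeat=n_hidden):
        e = sum(bi * xi for bi, xi in zip(b, x))
        e += sum(cj * hj for cj, hj in zip(c, h))
        e += sum(h[j] * W[j][i] * x[i]
                 for j in range(n_hidden) for i in range(len(x)))
        exponents.append(e)
    m = max(exponents)  # log-sum-exp shift for stability
    return m + math.log(sum(math.exp(e - m) for e in exponents))


def log_marginal_softplus(x, W, b, c):
    # Closed form: b.x + sum_j softplus(c_j + W_j . x).
    linear = sum(bi * xi for bi, xi in zip(b, x))
    hidden = sum(
        softplus(c[j] + sum(W[j][i] * x[i] for i in range(len(x))))
        for j in range(len(c))
    )
    return linear + hidden


def max_abs_gap(n_visible=3, n_hidden=2, seed=0):
    # Hypothetical helper: random parameters, largest disagreement over all x.
    rng = random.Random(seed)
    W = [[rng.uniform(-2, 2) for _ in range(n_visible)] for _ in range(n_hidden)]
    b = [rng.uniform(-2, 2) for _ in range(n_visible)]
    c = [rng.uniform(-2, 2) for _ in range(n_hidden)]
    return max(
        abs(log_marginal_bruteforce(x, W, b, c) - log_marginal_softplus(x, W, b, c))
        for x in itertools.product((0, 1), repeat=n_visible)
    )
```

Calling `max_abs_gap()` should return a value at the level of floating-point rounding, which reflects that the two expressions agree for every visible state. Both functions omit the log-partition constant, so they describe the distribution only up to normalization.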

Abstract

We investigate the representation of hierarchical models in terms of marginals of other hierarchical models with smaller interactions. We focus on binary variables and marginals of pairwise interaction models whose hidden variables are conditionally independent given the visible variables. In this case the problem is equivalent to the representation of linear subspaces of polynomials by feedforward neural networks with soft-plus computational units. We show that every hidden variable can freely model multiple interactions among the visible variables, which allows us to generalize and improve previous results. In particular, we show that a restricted Boltzmann machine with less than $[2 (\log(v) + 1) / (v + 1)]\, 2^{v} - 1$ hidden binary variables can approximate every distribution of $v$ visible binary variables arbitrarily well, compared to $2^{v - 1} - 1$ from the best previously known result.
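To get a feel for the two bounds quoted in the abstract, the following sketch evaluates both expressions for a given number of visible units. The abstract does not state the base of the logarithm, so the base is left as a parameter; the default of base 2 is an assumption made here, and the function names are illustrative.

```python
import math


def new_bound(v, log=math.log2):
    # [2 (log(v) + 1) / (v + 1)] * 2^v - 1, with the log base as an assumption.
    return (2 * (log(v) + 1) / (v + 1)) * 2 ** v - 1


def previous_bound(v):
    # 2^(v - 1) - 1, the earlier bound cited in the abstract.
    return 2 ** (v - 1) - 1


def ratio(v, log=math.log2):
    # Values below 1 mean the new expression is the smaller of the two.
    return new_bound(v, log) / previous_bound(v)
```

For large $v$ the ratio behaves like $4(\log(v) + 1)/(v + 1)$, which tends to zero, so the gain over $2^{v-1} - 1$ grows with the number of visible units. For small $v$ the new expression can exceed the old one, so the comparison is mainly informative in the large-$v$ regime.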

Taxonomy

Methods: Restricted Boltzmann Machine