Capacity of the treelike sign perceptrons neural networks with one hidden layer -- RDT based upper bounds
Mihailo Stojnic

TL;DR
This paper uses Random Duality Theory to rigorously establish upper bounds on the capacity of treelike sign perceptron neural networks with one hidden layer, confirming predictions from statistical physics and improving previous bounds for small network sizes.
Contribution
It introduces a mathematical framework to establish upper bounds on the capacity of 1-hidden layer treelike sign perceptrons, matching physics predictions and improving known bounds for small networks.
Findings
The capacity upper bounds match replica symmetry predictions from statistical physics.
Improved bounds for networks with up to 5 neurons.
First rigorous progress in over 30 years for small network capacities.
Abstract
We study the capacity of sign perceptrons neural networks (SPNN) and particularly focus on 1-hidden layer treelike committee machine (TCM) architectures. Similarly to what happens in the case of a single perceptron neuron, it turns out that, in a statistical sense, the capacity of a corresponding multilayered network architecture consisting of multiple sign perceptrons also undergoes the so-called phase transition (PT) phenomenon. This means: (i) for a certain range of system parameters (size of data, number of neurons), the network can be properly trained to accurately memorize all elements of the input dataset; and (ii) outside that region no such training exists. Clearly, determining the corresponding phase transition curve that separates these regions is an extraordinary task and among the most fundamental questions related to the performance of any…
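To make the architecture in the abstract concrete, here is a minimal sketch of a treelike committee machine under the standard textbook definition: the input is split into d disjoint blocks, each hidden sign neuron sees only its own block, and the output is the sign of the sum of the hidden outputs. The function names (`tcm_output`, `memorizes`, `demo`), the tie-breaking convention at zero, and the Gaussian toy data are illustrative assumptions, not taken from the paper.

```python
import random

def sign(x):
    # Sign activation; mapping zero to +1 is an arbitrary convention (assumption).
    return 1 if x >= 0 else -1

def tcm_output(weights, x):
    """Output of a treelike committee machine (illustrative sketch).

    weights: list of d weight vectors, one per hidden neuron, each of length n // d.
    x: input vector of length n, split into d disjoint consecutive blocks.
    """
    d = len(weights)
    block = len(x) // d
    hidden = []
    for i, w in enumerate(weights):
        xi = x[i * block:(i + 1) * block]  # neuron i sees only its own block
        hidden.append(sign(sum(wj * xj for wj, xj in zip(w, xi))))
    # With an odd number of hidden neurons the majority vote is never tied.
    return sign(sum(hidden))

def memorizes(weights, data, labels):
    # True if the network reproduces every label in the dataset.
    return all(tcm_output(weights, x) == y for x, y in zip(data, labels))

def demo(n=15, d=3, m=5, seed=0):
    # Hypothetical toy setup: random weights, random inputs, and labels
    # generated by the network itself, so memorization holds by construction.
    rng = random.Random(seed)
    weights = [[rng.gauss(0, 1) for _ in range(n // d)] for _ in range(d)]
    data = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
    labels = [tcm_output(weights, x) for x in data]
    return weights, data, labels
```

In this picture, the capacity question asks how large the number of patterns m can grow relative to the input dimension n before no choice of weights satisfies `memorizes` for typical random data. The sketch only evaluates a given network and does not attempt to find such weights.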
Taxonomy
Topics: Machine Learning in Materials Science · Neural Networks and Applications · Statistical Mechanics and Entropy
