When low-loss paths make a binary neuron trainable: detecting algorithmic transitions with the connected ensemble
Damien Barbier

TL;DR
This paper studies the connected ensemble, a statistical-mechanics framework introduced in earlier work, which identifies when low-loss paths form in rugged landscapes. Applied to the symmetric binary perceptron (SBP), it yields a threshold beyond which training is easy and shows how the robustness of the connected minima changes with the task.
Contribution
The paper applies the connected ensemble framework to the SBP model, identifying critical thresholds for the existence of connected minima and analyzing their properties as task difficulty varies.
Findings
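Connected minima exist only above a critical threshold, or equivalently below a critical constraint density.
In this parameter range training is easy, because local algorithms can efficiently access the connected manifold.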
Minima become more robust and similar as task difficulty increases.
Abstract
We study the connected ensemble, a statistical-mechanics framework that characterizes the formation of low-loss paths in rugged landscapes. First introduced in a previous paper, this ensemble allows one to identify when a network can be trained on a simple task and which minima should be targeted during training. We apply this new framework to the symmetric binary perceptron model (SBP), and study how its typical connected minima behave. We show that connected minima exist only above a critical threshold, or equivalently below a critical constraint density. This defines a parameter range in which training the network is easy, as local algorithms can efficiently access this connected manifold. We also highlight that these minima become increasingly robust and closer to one another as the task on which the network is trained becomes…
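To make the setting concrete, here is a minimal sketch of the SBP loss and of one simple local algorithm. It assumes the standard SBP definition, which this page does not spell out: binary weights w in {-1, +1}^N, M Gaussian patterns, and a constraint |w · ξ| / sqrt(N) ≤ κ on each pattern, with constraint density α = M / N. The names `sbp_violations` and `local_search` are hypothetical, and the zero-temperature single-flip search is a generic illustration, not the connected-ensemble computation or any algorithm taken from the paper.

```python
import numpy as np


def sbp_violations(w, patterns, kappa):
    """Count patterns whose symmetric constraint |w . xi| / sqrt(N) <= kappa fails."""
    n = w.shape[0]
    overlaps = patterns @ w / np.sqrt(n)
    return int(np.sum(np.abs(overlaps) > kappa))


def local_search(patterns, kappa, max_sweeps=200, seed=0):
    """Zero-temperature single-flip search (illustrative assumption, not the paper's method).

    Each proposed flip is kept only if it does not increase the number
    of violated constraints; otherwise it is undone.
    """
    rng = np.random.default_rng(seed)
    _, n = patterns.shape
    w = rng.choice(np.array([-1, 1]), size=n)  # random binary start
    loss = sbp_violations(w, patterns, kappa)
    for _ in range(max_sweeps):
        if loss == 0:
            break  # every constraint is satisfied
        for i in rng.permutation(n):
            w[i] = -w[i]  # propose flipping weight i
            new_loss = sbp_violations(w, patterns, kappa)
            if new_loss <= loss:
                loss = new_loss  # accept: loss did not increase
            else:
                w[i] = -w[i]  # reject: undo the flip
    return w, loss


if __name__ == "__main__":
    n, alpha, kappa = 101, 0.3, 1.0  # illustrative values only
    m = int(alpha * n)
    patterns = np.random.default_rng(1).standard_normal((m, n))
    w, loss = local_search(patterns, kappa)
    print(f"violated constraints after search: {loss} of {m}")
```

Rerunning the script while raising `alpha` or lowering `kappa` gives a rough feel for how the task hardens for such a search. The values at which this happens in a small simulation should not be read as the critical threshold or density discussed in the abstract.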
Taxonomy
Topics: Neural Networks and Applications · Neural dynamics and brain function · Stochastic Gradient Optimization Techniques
