Solution space and storage capacity of fully connected two-layer neural networks with generic activation functions
Sota Nishiyama, Masayuki Ohzeki

TL;DR
This paper uses the replica method to study the solution space and storage capacity of fully connected two-layer neural networks with general activation functions. It finds a finite storage capacity per parameter even at infinite width, negatively correlated weights, and a phase transition with increasing dataset size that affects learnability.
Contribution
It provides a theoretical analysis of how activation functions and network size influence storage capacity and solution space structure using the replica method.
Findings
Storage capacity per parameter remains finite with infinite width.
Weights exhibit negative correlations, indicating a 'division of labor'.
A phase transition occurs once the dataset size reaches a transition point, breaking the permutation symmetry of the weights.
Abstract
The storage capacity of a binary classification model is the maximum number of random input-output pairs per parameter that the model can learn. It is one of the indicators of the expressive power of machine learning models and is important for comparing the performance of various models. In this study, we analyze the structure of the solution space and the storage capacity of fully connected two-layer neural networks with general activation functions using the replica method from statistical physics. Our results demonstrate that the storage capacity per parameter remains finite even with infinite width and that the weights of the network exhibit negative correlations, leading to a 'division of labor'. In addition, we find that increasing the dataset size triggers a phase transition at a certain transition point where the permutation symmetry of weights is broken, resulting in the…
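To make the definition of storage capacity concrete, the following is a minimal Python sketch of the quantities involved. It is not the authors' method (the paper uses the replica method analytically). The architecture details are assumptions made for illustration: second-layer weights fixed to 1, pre-activations scaled by 1/sqrt(N), Gaussian inputs, and random ±1 labels. All function names are hypothetical.

```python
import math
import random


def two_layer_output(weights, x, activation):
    """Binary output of a two-layer network.

    Assumption: the second-layer weights are fixed to 1, so the output is the
    sign of the summed hidden activations.
    """
    n = len(x)
    total = 0.0
    for w in weights:  # one weight vector per hidden unit
        pre_activation = sum(wi * xi for wi, xi in zip(w, x)) / math.sqrt(n)
        total += activation(pre_activation)
    return 1 if total >= 0 else -1


def random_dataset(p, n, rng):
    """Draw p random input-output pairs: Gaussian inputs, random +/-1 labels."""
    xs = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    ys = [rng.choice([-1, 1]) for _ in range(p)]
    return xs, ys


def load_per_parameter(p, n, k):
    """Number of stored pairs per trainable weight (n inputs, k hidden units)."""
    return p / (n * k)


def stores_dataset(weights, xs, ys, activation):
    """True if the network classifies every pair in the dataset correctly."""
    return all(two_layer_output(weights, x, activation) == y
               for x, y in zip(xs, ys))


if __name__ == "__main__":
    rng = random.Random(0)
    n, k, p = 10, 3, 30
    xs, ys = random_dataset(p, n, rng)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    print("load per parameter:", load_per_parameter(p, n, k))
    print("random weights store the data:",
          stores_dataset(weights, xs, ys, math.tanh))
```

Under these assumptions, the storage capacity corresponds to the largest load at which some choice of `weights` still makes `stores_dataset` return true for typical random datasets. Randomly drawn weights, as in the usage example, will almost never achieve this; finding such weights numerically would require a training procedure that is outside the scope of this sketch.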
Taxonomy
Topics: Neural Networks and Applications
