On the High Symmetry of Neural Network Functions
Umberto Michelucci

TL;DR
This paper explores the high symmetry of neural network functions and shows that the number of equivalent minima grows factorially with network size, which bears on how training convergence is understood.
Contribution
It provides the first rigorous mathematical analysis and estimates of the factorial growth of equivalent minima in neural networks due to their symmetry.
Findings
The number of equivalent minima grows factorially with the number of neurons and filters.
Symmetry causes multiple identical minima in parameter space.
This symmetry has implications for studies of neural network training and convergence.
Abstract
Training neural networks means solving a high-dimensional optimization problem. Normally the goal is to minimize a loss function that depends on what is called the network function, in other words the function that gives the network output for a given input. This function depends on a large number of parameters, also known as weights, whose number depends on the network architecture. In general the goal of this optimization problem is to find the global minimum of the network function. This paper discusses how, because of the way neural networks are designed, the neural network function presents a very high degree of symmetry in the parameter space. This work shows that the neural network function has a number of equivalent minima, in other words minima that give the same value for the loss function and exactly the same output, that grows factorially with the number of neurons in each layer for feed…
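The symmetry described in the abstract can be illustrated with a minimal sketch. It assumes a small fully connected network with one hidden layer and a tanh activation; the layer sizes, the variable names, and the `network` helper are illustrative choices, not taken from the paper. Reordering the hidden neurons, together with the matching columns of the next layer's weights, gives a different point in parameter space that produces the same output.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes, chosen only for illustration.
n_in, n_hidden, n_out = 3, 5, 2
W1 = rng.normal(size=(n_hidden, n_in))   # input -> hidden weights
b1 = rng.normal(size=n_hidden)           # hidden biases
W2 = rng.normal(size=(n_out, n_hidden))  # hidden -> output weights
b2 = rng.normal(size=n_out)              # output biases

def network(x, W1, b1, W2, b2):
    """Network function of a one-hidden-layer feed-forward network."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

# Relabel the hidden neurons: permute the rows of W1 and the entries
# of b1, and apply the same permutation to the columns of W2.
perm = rng.permutation(n_hidden)
W1_p, b1_p, W2_p = W1[perm], b1[perm], W2[:, perm]

x = rng.normal(size=n_in)
y = network(x, W1, b1, W2, b2)
y_p = network(x, W1_p, b1_p, W2_p, b2)

# Outputs agree up to floating-point rounding.
same_output = np.allclose(y, y_p)

# Number of orderings of this single hidden layer.
n_equivalent = math.factorial(n_hidden)
```

Because every ordering of the hidden neurons yields the same output for every input, it also yields the same loss, so any minimum of this toy network comes with `n_hidden!` permuted copies. The sketch covers a single hidden layer only; the paper's own estimates for deeper networks and for filters are not reproduced here.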
Taxonomy
Topics: Neural Networks and Applications
