Most ReLU Networks Admit Identifiable Parameters
Moritz Grillo, Guido Montúfar

TL;DR
This paper proves that every ReLU architecture whose input and hidden layers have width at least two has an open set of parameters that are identifiable up to scaling and permutation. This determines the functional dimension of these architectures and yields a generic depth hierarchy.
Contribution
It introduces a framework based on weighted polyhedral complexes to analyze parameter redundancies and establishes a depth hierarchy for ReLU networks.
Findings
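Every architecture whose input and hidden layers have width at least two has an open set of identifiable parameters.
For every such architecture, the functional dimension is exactly the number of parameters minus the number of hidden neurons.
Minimal functional representations can still have non-trivial parameter redundancies.
For an open set of parameters, the realized function cannot be represented generically by any shallower network.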
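The functional dimension count in the findings can be illustrated with a short calculation. This is a minimal sketch, not code from the paper: the helper name `functional_dimension` is hypothetical, and it assumes a fully connected architecture in which every layer has a weight matrix and a bias vector, so the parameter count is the sum of `n_in * n_out + n_out` over consecutive layers.

```python
def functional_dimension(widths):
    """Parameters minus hidden neurons for widths = (n_0, n_1, ..., n_L).

    n_0 is the input width, n_1 .. n_{L-1} are hidden widths, n_L is the output width.
    Assumption (not taken from the paper): fully connected layers, each with a bias.
    """
    if len(widths) < 2:
        raise ValueError("need at least an input and an output layer")
    # The result summarized above is stated for input and hidden widths >= 2.
    if any(w < 2 for w in widths[:-1]):
        raise ValueError("input and hidden layers must have width at least two")
    # Weights plus biases of every affine layer.
    num_params = sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))
    # Every hidden neuron contributes one positive scaling degree of freedom.
    num_hidden = sum(widths[1:-1])
    return num_params - num_hidden


if __name__ == "__main__":
    for arch in [(2, 3, 1), (2, 2, 2, 1)]:
        print(arch, functional_dimension(arch))
```

Under these assumptions, the subtraction matches the intuition that each hidden neuron can be rescaled by a positive factor without changing the function, so one parameter per hidden neuron is redundant.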
Abstract
We study the realization map of deep ReLU networks, focusing on when a function determines its parameters up to scaling and permutation. To analyze hidden redundancies beyond these standard symmetries, we introduce a framework based on weighted polyhedral complexes. Our main result shows that for every architecture whose input and hidden layers have width at least two, there exists an open set of identifiable parameters. This implies that the functional dimension of every such architecture is exactly the number of parameters minus the number of hidden neurons. We further show that minimal functional representations can still have non-trivial parameter redundancies. Finally, we establish a generic depth hierarchy, whereby for an open set of parameters the realized function cannot be represented generically by any shallower network.
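The two standard symmetries mentioned in the abstract, scaling and permutation, can be checked numerically on a tiny network. The following is an illustrative sketch in plain Python rather than the authors' construction: the helper names (`forward`, `rescale_neuron`, `permute_neurons`) and the example weights are hypothetical, and the network is assumed to be fully connected with biases and ReLU on every layer except the last.

```python
def relu(x):
    return [max(0.0, v) for v in x]


def affine(W, b, x):
    # W is a list of rows; returns W x + b.
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(params, x):
    """params is a list of (W, b) pairs; ReLU follows every layer except the last."""
    for W, b in params[:-1]:
        x = relu(affine(W, b, x))
    W, b = params[-1]
    return affine(W, b, x)


def copy_params(params):
    return [([row[:] for row in W], b[:]) for W, b in params]


def rescale_neuron(params, layer, j, c):
    """Scale hidden neuron j of `layer` by c > 0 and its outgoing weights by 1/c."""
    if c <= 0:
        raise ValueError("scaling factor must be positive")
    new = copy_params(params)
    W, b = new[layer]
    W[j] = [c * w for w in W[j]]
    b[j] = c * b[j]
    W_next, _ = new[layer + 1]
    for row in W_next:
        row[j] = row[j] / c
    return new


def permute_neurons(params, layer, perm):
    """Reorder the hidden neurons of `layer` and the matching columns of the next layer."""
    new = copy_params(params)
    W, b = new[layer]
    W_next, b_next = new[layer + 1]
    new[layer] = ([W[p] for p in perm], [b[p] for p in perm])
    new[layer + 1] = ([[row[p] for p in perm] for row in W_next], b_next)
    return new


if __name__ == "__main__":
    # Hypothetical weights for an architecture with widths (2, 2, 1).
    params = [([[1.0, -2.0], [0.5, 3.0]], [0.5, -1.0]), ([[2.0, -1.0]], [0.25])]
    x = [0.3, -0.7]
    print(forward(params, x))
    print(forward(rescale_neuron(params, 0, 1, 4.0), x))
    print(forward(permute_neurons(params, 0, [1, 0]), x))
```

The rescaling works because ReLU is positively homogeneous, that is, relu(c·t) = c·relu(t) for c > 0. Identifiability in the sense of the abstract means that, for the parameters in question, these are the only transformations that leave the realized function unchanged.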
