The Symmetries of Three-Layer ReLU Networks
Johanna Marie Gegenfurtner, Moritz Grillo, Guido Montúfar

TL;DR
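This paper characterizes the parameter symmetries of three-layer ReLU networks with bottleneck architectures. It gives explicit semi-algebraic descriptions of the generic parameter fibers, a polynomial-time algorithm for deciding whether two parameters are functionally equivalent, and an analysis of which symmetries induce local conservation laws along gradient flow.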
Contribution
It offers a complete, explicit characterization of the generic parameter fibers of three-layer bottleneck ReLU networks, together with a polynomial-time algorithm for deciding functional equivalence of two parameters.
Findings
Explicit semi-algebraic descriptions of parameter fibers
Polynomial-time algorithm for deciding functional equivalence of two parameters
Identification of which symmetries induce local conservation laws along gradient flow and which do not
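To make the link between symmetries and conservation laws concrete, here is a minimal sketch of the classical example: positive rescaling of a single hidden ReLU neuron. This is a well-known symmetry and is not necessarily one of the specific symmetries the paper analyzes. Because the loss is unchanged when a neuron's incoming weights and bias are multiplied by a positive factor and its outgoing weights are divided by it, the gradient satisfies <in, grad_in> - <out, grad_out> = 0, so the squared norm of the incoming parameters minus the squared norm of the outgoing weights stays constant along gradient flow. The 2-2-2-1 architecture, the parameter values, the data, and the helper names `loss` and `grad` are all hypothetical choices for illustration, and the gradient is approximated by finite differences.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def affine(W, b, x):
    # W is a list of rows; returns W x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def loss(theta, data):
    # Mean squared error of a toy 2-2-2-1 ReLU network with flat parameters theta.
    W1 = [theta[0:2], theta[2:4]]
    b1 = theta[4:6]
    W2 = [theta[6:8], theta[8:10]]
    b2 = theta[10:12]
    W3 = [theta[12:14]]
    b3 = theta[14:15]
    total = 0.0
    for x, y in data:
        h1 = relu(affine(W1, b1, x))
        h2 = relu(affine(W2, b2, h1))
        out = affine(W3, b3, h2)[0]
        total += 0.5 * (out - y) ** 2
    return total / len(data)

def grad(theta, data, eps=1e-6):
    # Central finite differences; adequate away from ReLU kinks.
    g = []
    for i in range(len(theta)):
        up = theta[:i] + [theta[i] + eps] + theta[i + 1:]
        dn = theta[:i] + [theta[i] - eps] + theta[i + 1:]
        g.append((loss(up, data) - loss(dn, data)) / (2 * eps))
    return g

# Hypothetical parameters and data, chosen so no pre-activation sits near zero.
theta = [0.5, -0.3, 0.2, 0.7, 0.1, 0.4,
         0.6, 0.3, -0.2, 0.8, 0.05, 0.1,
         0.9, -0.4, 0.2]
data = [([1.0, 0.5], 1.0), ([0.2, 1.5], -0.5), ([-0.7, 0.3], 0.3)]

g = grad(theta, data)

# First hidden neuron: incoming weights theta[0], theta[1] and bias theta[4];
# outgoing weights are the first column of W2, i.e. theta[6] and theta[8].
in_idx, out_idx = [0, 1, 4], [6, 8]
residual = (sum(theta[i] * g[i] for i in in_idx)
            - sum(theta[i] * g[i] for i in out_idx))
print("d/dt of (|in|^2 - |out|^2)/2 along gradient flow is -residual:", residual)
```

The residual vanishes up to finite-difference error even though the individual gradient entries do not, which is the infinitesimal form of the conservation law.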
Abstract
We develop a framework for analyzing parameter symmetries in deep ReLU networks and obtain a complete characterization of the generic parameter fibers for three-layer bottleneck architectures. Our approach provides explicit semi-algebraic descriptions of these fibers and yields a polynomial time algorithm for deciding functional equivalence of two parameters. The symmetries include discrete and continuous transformations arising from layer composition, and depend on whether deeper layers hide or preserve geometric structure from preceding layers. Finally, we show that some of these symmetries induce local conservation laws along gradient flow, while others do not.
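As a small illustration of what a parameter symmetry is, the sketch below checks the two best-known ones for ReLU networks: permuting hidden neurons and rescaling a neuron by a positive factor while dividing its outgoing weights by the same factor. Both leave the network function unchanged, so the two parameters lie in the same fiber. The paper's characterization covers more than these, and this is not its equivalence algorithm. The layer sizes, the reading of "three-layer" as three affine maps, and the helper names `forward`, `random_params`, and `rescale_and_permute` are assumptions made for the example.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def affine(W, b, x):
    # W is a list of rows; returns W x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def forward(params, x):
    # Two hidden ReLU layers followed by a linear output layer.
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(affine(W1, b1, x))
    h2 = relu(affine(W2, b2, h1))
    return affine(W3, b3, h2)

def random_params(sizes, rng):
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        b = [rng.uniform(-1, 1) for _ in range(n_out)]
        params.append((W, b))
    return params

def rescale_and_permute(params, scales, perm):
    # New first-layer neuron j is old neuron perm[j] scaled by scales[j] > 0;
    # the matching column of W2 is permuted and divided by scales[j].
    (W1, b1), (W2, b2), (W3, b3) = params
    n = len(perm)
    W1n = [[scales[j] * w for w in W1[perm[j]]] for j in range(n)]
    b1n = [scales[j] * b1[perm[j]] for j in range(n)]
    W2n = [[row[perm[j]] / scales[j] for j in range(n)] for row in W2]
    return [(W1n, b1n), (W2n, b2), (W3, b3)]

rng = random.Random(0)
params = random_params([2, 3, 2, 1], rng)  # hypothetical layer sizes
moved = rescale_and_permute(params, scales=[2.0, 0.5, 3.0], perm=[2, 0, 1])

# Compare both parameters on random inputs; they should agree up to rounding.
max_diff = 0.0
for _ in range(100):
    x = [rng.uniform(-2, 2) for _ in range(2)]
    y_old, y_new = forward(params, x), forward(moved, x)
    max_diff = max(max_diff, max(abs(a - b) for a, b in zip(y_old, y_new)))
print("largest output difference:", max_diff)
```

Agreement on sampled inputs is only evidence of functional equivalence, not a proof; here equivalence holds exactly because ReLU commutes with multiplication by a positive scalar.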
