Identifiable Equivariant Networks are Layerwise Equivariant
Vahid Shahverdi, Giovanni Luca Marchetti, Georg Bökman, Kathlén Kohn

TL;DR
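This paper proves that, for a neural network whose end-to-end function is equivariant and whose parameters are identifiable, there exists a parameter setting with the same end-to-end function in which each layer is individually equivariant. This offers an explanation for how equivariant structures emerge in the weights during training.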
Contribution
It establishes a formal link between end-to-end and layerwise equivariance under parameter identifiability, applicable across various architectures.
Findings
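Under parameter identifiability, end-to-end equivariance implies that some parameter choice realizing the same function makes every layer equivariant with respect to suitable group actions on the latent spaces.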
The results are architecture-agnostic and grounded in an abstract formalism.
The results provide a mathematical explanation for the emergence of equivariant weights in trained networks.
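
The relationship between the two notions of equivariance can be checked numerically on a toy model. The following is a minimal sketch, not the paper's construction: it assumes a two-layer ReLU network, the two-element group acting by coordinate swaps, and hypothetical names (`symmetrize`, `network`, `P_hid`, `Q`) chosen for illustration. It builds layerwise equivariant weights by group averaging, confirms that the end-to-end function is then equivariant, and shows that permuting hidden units gives a different parameter choice with the same end-to-end function whose layers are equivariant with respect to a different hidden action.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

# Generator of the two-element group, acting on input (R^2),
# hidden (R^4), and output (R^2) spaces by permutation matrices.
# Each matrix is an involution: P @ P = I.
P_in = np.array([[0.0, 1.0], [1.0, 0.0]])
P_hid = np.kron(np.eye(2), P_in)  # swaps hidden units 0<->1 and 2<->3
P_out = P_in.copy()

def symmetrize(A, P_tgt, P_src):
    # Group averaging for a two-element group: the result W satisfies
    # P_tgt @ W == W @ P_src (this relies on both P's being involutions).
    return 0.5 * (A + P_tgt @ A @ P_src)

# Layerwise equivariant weights (illustrative, randomly generated).
W1 = symmetrize(rng.normal(size=(4, 2)), P_hid, P_in)
W2 = symmetrize(rng.normal(size=(2, 4)), P_out, P_hid)

def network(x, W1, W2):
    return W2 @ relu(W1 @ x)

# A second parameter choice with the same end-to-end function:
# permute the hidden units by Q (ReLU commutes with permutations).
Q = np.eye(4)[rng.permutation(4)]
W1_alt = Q @ W1
W2_alt = W2 @ Q.T
# Its layers are equivariant for the conjugated hidden action.
P_hid_alt = Q @ P_hid @ Q.T

x = rng.normal(size=2)
end_to_end_ok = np.allclose(network(P_in @ x, W1, W2), P_out @ network(x, W1, W2))
same_function = np.allclose(network(x, W1_alt, W2_alt), network(x, W1, W2))
alt_layers_ok = (np.allclose(P_hid_alt @ W1_alt, W1_alt @ P_in)
                 and np.allclose(W2_alt @ P_hid_alt, P_out @ W2_alt))
print(end_to_end_ok, same_function, alt_layers_ok)
```

The sketch only exercises the easy direction (layerwise equivariance gives end-to-end equivariance) and the hidden-unit permutation symmetry. The paper's result concerns the converse, which depends on the identifiability assumption and is not established by this example.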
Abstract
We investigate the relation between end-to-end equivariance and layerwise equivariance in deep neural networks. We prove the following: For a network whose end-to-end function is equivariant with respect to group actions on the input and output spaces, there is a parameter choice yielding the same end-to-end function such that its layers are equivariant with respect to some group actions on the latent spaces. Our result assumes that the parameters of the model are identifiable in an appropriate sense. This identifiability property has been established in the literature for a large class of networks, to which our results apply immediately, while it is conjectural for others. The theory we develop is grounded in an abstract formalism, and is therefore architecture-agnostic. Overall, our results provide a mathematical explanation for the emergence of equivariant structures in the weights…
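
The statement in the abstract can be written out symbolically. The notation below is an assumption introduced here for readability (the symbols $f^{(i)}$, $X_i$, $\rho_i$, $\theta$ do not appear in the text above), and the precise identifiability hypothesis is left unspecified, as in the abstract.

```latex
% Network with L layers and parameters \theta = (\theta_1, \dots, \theta_L):
f_\theta = f^{(L)}_{\theta_L} \circ \cdots \circ f^{(1)}_{\theta_1}
  \colon X_0 \to X_L .

% Hypothesis: end-to-end equivariance for actions \rho_0 on X_0, \rho_L on X_L:
f_\theta\bigl(\rho_0(g)\,x\bigr) = \rho_L(g)\,f_\theta(x)
  \qquad \text{for all } g \in G,\ x \in X_0 .

% Conclusion (under identifiability): there exist parameters \theta' and
% actions \rho_1, \dots, \rho_{L-1} on the latent spaces X_1, \dots, X_{L-1}
% such that
f_{\theta'} = f_\theta
\quad\text{and}\quad
f^{(i)}_{\theta'_i}\bigl(\rho_{i-1}(g)\,z\bigr)
  = \rho_i(g)\,f^{(i)}_{\theta'_i}(z)
  \qquad \text{for all } i,\ g \in G,\ z \in X_{i-1} .
```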
Taxonomy
Topics: Advanced Graph Neural Networks · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
