On the Ability of Deep Networks to Learn Symmetries from Data: A Neural Kernel Theory
Andrea Perin, Stephane Deny

TL;DR
This paper develops a neural kernel theory to analyze how deep networks learn symmetries from data, revealing conditions under which generalization succeeds or fails, especially when symmetries are only partially observed.
Contribution
It introduces a theoretical framework for understanding symmetry learning in neural networks, extending to finite and non-abelian groups, and connects kernel analysis with empirical observations.
Findings
Kernel theory predicts generalization failure when symmetry structure is weak.
Equivariant architectures match data symmetries and generalize better.
Finite-width networks struggle to learn symmetries not embedded in architecture.
Abstract
Symmetries (transformations by group actions) are present in many datasets, and leveraging them holds considerable promise for improving predictions in machine learning. In this work, we aim to understand when and how deep networks -- with standard architectures trained in a standard, supervised way -- learn symmetries from data. Inspired by real-world scenarios, we study a classification paradigm where data symmetries are only partially observed during training: some classes include all transformations of a cyclic group, while others -- only a subset. In the infinite-width limit, where kernel analogies apply, we derive a neural kernel theory of symmetry learning. The group-cyclic nature of the dataset allows us to analyze the Gram matrix of neural kernels in the Fourier domain; here we find a simple characterization of the generalization error as a function of class separation (signal)…
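The Fourier-domain step mentioned in the abstract can be illustrated with a small sketch. When a kernel is invariant to applying the same cyclic shift to both inputs, the Gram matrix over one orbit of a cyclic group is circulant, so the discrete Fourier transform diagonalizes it. The RBF kernel and the helper names `cyclic_orbit` and `rbf_gram` below are assumptions chosen for illustration; they stand in for a neural kernel and are not the authors' construction.

```python
import numpy as np

def cyclic_orbit(x):
    """Stack all cyclic shifts of x: row k is x rolled by k positions."""
    return np.stack([np.roll(x, k) for k in range(len(x))])

def rbf_gram(X, gamma=0.5):
    """Gram matrix of an RBF kernel (illustrative stand-in for a neural kernel)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
x = rng.normal(size=8)          # hypothetical seed sample
K = rbf_gram(cyclic_orbit(x))   # Gram matrix over the full orbit of x

# Each row of a circulant matrix is the first row shifted by its row index.
is_circulant = all(np.allclose(np.roll(K[0], k), K[k]) for k in range(len(x)))

# Eigenvalues of a circulant matrix are the DFT of its first row.
# The first row is symmetric here, so the DFT is real up to rounding.
fourier_eigs = np.sort(np.fft.fft(K[0]).real)
direct_eigs = np.sort(np.linalg.eigvalsh(K))
```

Comparing `fourier_eigs` with `direct_eigs` confirms that the spectrum can be read off from a single row of the Gram matrix, which is what makes a per-frequency analysis tractable in this setting.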
Taxonomy
Topics: Neural Networks and Applications
Methods: Focus
