Symmetry and Generalisation in Machine Learning
Hayder Elesedy

TL;DR
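This paper proves that symmetry, in the form of invariance or equivariance, improves generalisation in supervised learning: when the symmetry is correctly specified, any non-equivariant predictor can be replaced by an equivariant one with strictly lower test risk. The result is then applied to random design least squares and kernel ridge regression.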
Contribution
It provides a formal proof that symmetry acts as a beneficial inductive bias and introduces an averaging operator approach to analyze equivariant predictors.
Findings
Symmetry reduces test risk in supervised learning when the invariance or equivariance is correctly specified.
The averaging operator is an effective tool for analysing equivariant predictors.
Invariant models simplify learning by focusing on orbit representatives.
Abstract
This work is about understanding the impact of invariance and equivariance on generalisation in supervised learning. We use the perspective afforded by an averaging operator to show that for any predictor that is not equivariant, there is an equivariant predictor with strictly lower test risk on all regression problems where the equivariance is correctly specified. This constitutes a rigorous proof that symmetry, in the form of invariance or equivariance, is a useful inductive bias. We apply these ideas to equivariance and invariance in random design least squares and kernel ridge regression respectively. This allows us to specify the reduction in expected test risk in more concrete settings and express it in terms of properties of the group, the model and the data. Along the way, we give examples and additional results to demonstrate the utility of the averaging operator approach…
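The central claim of the abstract can be illustrated numerically. The sketch below is a toy example, not code from the paper. It assumes a finite group (rotations of the plane by multiples of 90 degrees), a noiseless invariant target, and a sample that is symmetrised so that its empirical distribution is exactly invariant. The names `target`, `predictor`, `average_over_group` and `risk` are hypothetical and chosen for illustration. Averaging a non-invariant predictor over the group yields an invariant predictor, and in this setting the drop in squared-error risk equals the mean squared size of the part removed by averaging.

```python
import numpy as np

# The cyclic group C4 acting on R^2 by rotations of 0, 90, 180 and 270 degrees.
ROTATIONS = [
    np.array([[1, 0], [0, 1]]),
    np.array([[0, -1], [1, 0]]),
    np.array([[-1, 0], [0, -1]]),
    np.array([[0, 1], [-1, 0]]),
]

def target(x):
    """Rotation-invariant regression target (assumed, noiseless)."""
    return np.sum(x**2, axis=1)

def predictor(x):
    """A predictor that is NOT invariant because of the linear term in x[:, 0]."""
    return 0.8 * np.sum(x**2, axis=1) + 0.5 * x[:, 0]

def average_over_group(f, group):
    """Return the predictor x -> mean over g in group of f(g x)."""
    def averaged(x):
        return np.mean([f(x @ g.T) for g in group], axis=0)
    return averaged

def risk(f, x):
    """Mean squared error of f against the target on the sample x."""
    return np.mean((f(x) - target(x)) ** 2)

rng = np.random.default_rng(0)
x = rng.standard_normal((2000, 2))
# Add every rotated copy of each point so the sample is exactly invariant.
x_sym = np.concatenate([x @ g.T for g in ROTATIONS])

f_bar = average_over_group(predictor, ROTATIONS)
risk_f = risk(predictor, x_sym)
risk_f_bar = risk(f_bar, x_sym)
# Mean squared size of the non-invariant part that averaging removes.
gap = np.mean((predictor(x_sym) - f_bar(x_sym)) ** 2)

print(f"risk of original predictor: {risk_f:.4f}")
print(f"risk of averaged predictor: {risk_f_bar:.4f}")
print(f"size of removed component:  {gap:.4f}")
```

On the symmetrised sample the cross term between the invariant and non-invariant parts vanishes, so `risk_f` equals `risk_f_bar + gap` up to floating-point error. On a plain i.i.d. sample the same identity would hold only approximately.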
Taxonomy
Topics: Neural Networks and Applications
Methods: ADaptive gradient method with the OPTimal convergence rate
