Symmetry and Generalisation in Machine Learning

Hayder Elesedy

arXiv:2501.03858·cs.LG·January 8, 2025

Open Access

TL;DR

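This paper proves that symmetry, in the form of invariance or equivariance, improves generalisation in supervised learning: for any predictor that is not equivariant, there is an equivariant predictor with strictly lower test risk whenever the equivariance is correctly specified. The result is then applied to random design least squares and kernel ridge regression.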

Contribution

It provides a rigorous proof that symmetry is a useful inductive bias and uses an averaging operator to analyse equivariant predictors.

Findings

01. Symmetry reduces test risk in supervised learning when the equivariance is correctly specified.

02. An averaging operator provides a useful perspective for analysing equivariant predictors.

03. Invariant models simplify learning by focusing on orbit representatives.

Abstract

This work is about understanding the impact of invariance and equivariance on generalisation in supervised learning. We use the perspective afforded by an averaging operator to show that for any predictor that is not equivariant, there is an equivariant predictor with strictly lower test risk on all regression problems where the equivariance is correctly specified. This constitutes a rigorous proof that symmetry, in the form of invariance or equivariance, is a useful inductive bias. We apply these ideas to equivariance and invariance in random design least squares and kernel ridge regression respectively. This allows us to specify the reduction in expected test risk in more concrete settings and express it in terms of properties of the group, the model and the data. Along the way, we give examples and additional results to demonstrate the utility of the averaging operator approach…
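
To make the averaging-operator claim concrete, here is a minimal numerical sketch of the invariant case. Everything specific in it is an assumption chosen for illustration and does not come from the paper: the group is the cyclic group acting on coordinates by shifts, the target is a shift-invariant function, the predictor is an arbitrary linear map, and the data is a Gaussian sample closed under the group action so that the empirical distribution is exactly invariant. Under those assumptions the squared-error risk of the predictor splits into the risk of its averaged version plus the squared distance between the two.

```python
import numpy as np

def orbit(x):
    """Stack all cyclic shifts of the last axis of x (orbit under the cyclic group C_d)."""
    d = x.shape[-1]
    return np.stack([np.roll(x, k, axis=-1) for k in range(d)], axis=0)

def average(f):
    """Averaging operator: (O f)(x) = (1/|G|) * sum over g in G of f(g x)."""
    def f_bar(x):
        return np.mean([f(gx) for gx in orbit(x)], axis=0)
    return f_bar

rng = np.random.default_rng(0)
d, n = 4, 500
w = rng.normal(size=d)  # hypothetical weights of a non-invariant linear predictor

def target(x):
    # Illustrative target: invariant under cyclic shifts of the coordinates.
    return np.sum(x, axis=-1) ** 2

def predictor(x):
    # Linear predictor; not shift-invariant unless all entries of w are equal.
    return x @ w

predictor_bar = average(predictor)

# Build a sample that is closed under the group action, so the empirical
# distribution is exactly invariant (an assumption made for exactness).
base = rng.normal(size=(n, d))
X = orbit(base).reshape(-1, d)

def risk(f):
    """Empirical squared-error risk against the invariant target."""
    return np.mean((f(X) - target(X)) ** 2)

risk_f = risk(predictor)
risk_avg = risk(predictor_bar)
gap = np.mean((predictor(X) - predictor_bar(X)) ** 2)

print("risk of f:          ", risk_f)
print("risk of averaged f: ", risk_avg)
print("squared distance:   ", gap)
```

On this symmetrised sample the cross term in the decomposition cancels orbit by orbit, so `risk_f` equals `risk_avg + gap` up to floating-point error. With an ordinary i.i.d. sample the identity would hold only approximately, which is why the sketch closes the sample under the group action.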

Taxonomy

Topics: Neural Networks and Applications

Methods: ADaptive gradient method with the OPTimal convergence rate