Symmetries in Overparametrized Neural Networks: A Mean-Field View

Javier Maass; Joaquin Fontbona

arXiv:2405.19995 · stat.ML · May 27, 2025


Open Access

TL;DR

This paper develops a mean-field framework for analyzing the learning dynamics of overparametrized neural networks trained on data that is symmetric under a group action, showing how data augmentation, feature averaging, and equivariant architectures influence training and generalization.

Contribution

It introduces a mean-field perspective built on invariant laws over the parameter space, providing insight into the training dynamics of symmetric neural networks and the invariant distributions that arise during training.

Findings

01. Symmetric models follow Wasserstein gradient flows in the mean-field limit.

02. Data augmentation and feature averaging lead to identical mean-field dynamics when the data is symmetric.

03. Invariant laws are preserved during training in the mean-field limit, in contrast to the behavior of finite networks.
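
For orientation, the objects named in these findings can be written in the form that is standard in the mean-field literature on shallow networks. This is a sketch under stated assumptions, not the paper's exact statement: the symbols $\Phi_\mu$, $\sigma_*$, $R$, $\ell$ and $\mu_t$ are notation chosen here, and time-scaling constants and any regularization or noise terms are omitted.

$$
\Phi_\mu(x) = \int \sigma_*(x;\theta)\,\mu(d\theta), \qquad
R(\mu) = \mathbb{E}\big[\ell\big(\Phi_\mu(X), Y\big)\big],
$$

$$
\partial_t \mu_t = \nabla_\theta \cdot \Big( \mu_t \, \nabla_\theta \frac{\delta R}{\delta \mu}(\mu_t) \Big),
\qquad
g_{\#}\mu = \mu \ \text{ for all } g \in G .
$$

Here $\mu$ is the law of a single unit's parameters $\theta$, the first equation on the second line is a Wasserstein gradient flow of the risk $R$, and the last condition says that $\mu$ is invariant under the pushforward by the group action on parameters.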

Abstract

We develop a Mean-Field (MF) view of the learning dynamics of overparametrized Artificial Neural Networks (NN) under data symmetric in law with respect to the action of a general compact group $G$. We consider for this a class of generalized shallow NNs given by an ensemble of $N$ multi-layer units, jointly trained using stochastic gradient descent (SGD) and possibly symmetry-leveraging (SL) techniques, such as Data Augmentation (DA), Feature Averaging (FA) or Equivariant Architectures (EA). We introduce the notions of weakly and strongly invariant laws (WI and SI) on the parameter space of each single unit, corresponding, respectively, to $G$-invariant distributions, and to distributions supported on parameters fixed by the group action (which encode EA). This allows us to define symmetric models compatible with taking $N \to \infty$ and give an interpretation of the asymptotic dynamics of DA, FA…
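
The symmetry-leveraging ideas in the abstract can be illustrated with a small numerical sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: the group is the finite cyclic group acting on $\mathbb{R}^d$ by coordinate shifts, each unit is a single `tanh` neuron, the output carries the trivial group action, and the function names are hypothetical. The sketch compares a freely parametrized ensemble, its feature-averaged version, and an ensemble whose weights are fixed by the group action (the kind of parameters on which an SI law is supported).

```python
import numpy as np

def shallow_nn(x, W, a):
    """Average of N units a_i * tanh(w_i . x), with the 1/N scaling used in mean-field analyses."""
    return np.mean(a * np.tanh(W @ x))

def cyclic_orbit(x):
    """All cyclic shifts of x: the orbit of x under the cyclic group acting on R^d (illustrative choice)."""
    return [np.roll(x, k) for k in range(len(x))]

def feature_averaged(x, W, a):
    """Feature averaging: average the model output over the group orbit of the input."""
    return np.mean([shallow_nn(gx, W, a) for gx in cyclic_orbit(x)])

rng = np.random.default_rng(0)
d, N = 4, 50
W = rng.normal(size=(N, d))   # generic weights, not fixed by the shift action
a = rng.normal(size=N)
x = rng.normal(size=d)
gx = np.roll(x, 1)            # the input transformed by one group element

# A freely parametrized model is generally not invariant under the shift.
plain_gap = abs(shallow_nn(x, W, a) - shallow_nn(gx, W, a))

# The feature-averaged model is invariant by construction.
fa_gap = abs(feature_averaged(x, W, a) - feature_averaged(gx, W, a))

# Weights fixed by the action (constant rows) give an invariant model directly.
W_fixed = np.outer(rng.normal(size=N), np.ones(d))
fixed_gap = abs(shallow_nn(x, W_fixed, a) - shallow_nn(gx, W_fixed, a))

print(f"plain model gap:            {plain_gap:.3e}")
print(f"feature-averaged model gap: {fa_gap:.3e}")
print(f"fixed-weight model gap:     {fixed_gap:.3e}")
```

The feature-averaged and fixed-weight gaps vanish up to floating-point error, while the plain model's gap does not. Data augmentation would instead apply random group elements to training inputs inside the SGD loop; it is not shown here.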

Taxonomy

Topics: Neural Networks and Applications

Methods: Sparse Evolutionary Training · Feedback Alignment · Stochastic Gradient Descent