Equivariance and generalization in neural networks

Srinath Bulusu, Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh

arXiv:2112.12493·hep-lat·February 16, 2022

TL;DR

This paper shows that neural networks with built-in translational equivariance perform and generalize better than non-equivariant models on most of the regression and classification tasks studied for a complex scalar field theory, including generalization across parameters and lattice sizes.

Contribution

It compares equivariant and non-equivariant architectures, both selected through a systematic search, on a complex scalar field theory, and highlights the benefits of translational equivariance for performance and generalization.

Findings

01 Equivariant networks outperform non-equivariant ones in most of the regression and classification tasks studied.

02 Equivariant models generalize better to parameters not seen during training.

03 Performance gains are consistent across different lattice sizes.

Abstract

The crucial role played by the underlying symmetries of high energy physics and lattice field theories calls for the implementation of such symmetries in the neural network architectures that are applied to the physical system under consideration. In these proceedings, we focus on the consequences of incorporating translational equivariance among the network properties, particularly in terms of performance and generalization. The benefits of equivariant networks are exemplified by studying a complex scalar field theory, on which various regression and classification tasks are examined. For a meaningful comparison, promising equivariant and non-equivariant architectures are identified by means of a systematic search. The results indicate that in most of the tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts,…
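
The notion of translational equivariance used in the abstract can be illustrated with a minimal sketch. A layer f is translation-equivariant if shifting the input and then applying f gives the same result as applying f and then shifting the output. The example below is not the authors' architecture: it assumes a real-valued field on a small 2D periodic lattice (the paper studies a complex scalar field), and the names `periodic_conv2d`, `dense`, `field`, `kernel`, and `weights` are hypothetical. It checks numerically that a convolution with periodic boundaries commutes with lattice shifts, that a generic dense layer does not, and that global average pooling after the convolution yields a shift-invariant number.

```python
import numpy as np


def periodic_conv2d(field, kernel):
    """Cross-correlate a 2D field with a small kernel using periodic boundaries."""
    out = np.zeros_like(field, dtype=float)
    kh, kw = kernel.shape
    for i in range(kh):
        for j in range(kw):
            # np.roll wraps around the lattice edges, i.e. periodic boundary conditions.
            out += kernel[i, j] * np.roll(field, shift=(-i, -j), axis=(0, 1))
    return out


rng = np.random.default_rng(0)
field = rng.normal(size=(8, 8))     # assumed toy lattice configuration (real-valued)
kernel = rng.normal(size=(3, 3))    # assumed random convolution weights
shift = (2, 5)                      # arbitrary lattice translation

# Convolution: shifting before or after the layer gives the same output.
shift_then_conv = periodic_conv2d(np.roll(field, shift, axis=(0, 1)), kernel)
conv_then_shift = np.roll(periodic_conv2d(field, kernel), shift, axis=(0, 1))
conv_is_equivariant = np.allclose(shift_then_conv, conv_then_shift)

# Dense layer on the flattened lattice: generic weights break the symmetry.
weights = rng.normal(size=(field.size, field.size))


def dense(x):
    """Apply a fully connected layer to the flattened field and restore its shape."""
    return (weights @ x.ravel()).reshape(x.shape)


shift_then_dense = dense(np.roll(field, shift, axis=(0, 1)))
dense_then_shift = np.roll(dense(field), shift, axis=(0, 1))
dense_is_equivariant = np.allclose(shift_then_dense, dense_then_shift)

# Global average pooling after the convolution gives a shift-invariant scalar.
pooled_is_invariant = np.isclose(
    periodic_conv2d(field, kernel).mean(), shift_then_conv.mean()
)

print("convolution equivariant:", conv_is_equivariant)
print("dense layer equivariant:", dense_is_equivariant)
print("pooled output invariant:", pooled_is_invariant)
```

Because the convolution and pooling steps make no reference to the lattice extent, the same kernel can be applied to a field of a different size, which is one plausible reason such layers are convenient when testing generalization across lattice sizes.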
