The Lie Derivative for Measuring Learned Equivariance
Nate Gruver, Marc Finzi, Micah Goldblum, Andrew Gordon Wilson

TL;DR
This paper introduces the Lie derivative as a rigorous method to measure equivariance in neural networks, revealing that larger, more accurate models tend to be more equivariant regardless of architecture.
Contribution
The paper presents a mathematically grounded way to quantify equivariance across diverse models, enabling large-scale analysis of their symmetry properties.
Findings
Many equivariance violations are due to spatial aliasing in network layers.
Larger and more accurate models tend to exhibit greater equivariance.
Transformers can surpass CNNs in equivariance after training.
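As a rough illustration of the aliasing finding above, the toy sketch below shows how one aliasing-prone operation, strided subsampling without a low-pass filter, can break translation equivariance. It is not taken from the paper's code. The helpers `shift` and `subsample` and the 1-D test signal are assumptions made for illustration only.

```python
def shift(x, k):
    """Circularly shift a list right by k positions."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)

def subsample(x, stride=2):
    """Keep every `stride`-th sample with no low-pass filter (aliasing-prone)."""
    return x[::stride]

x = [3, 1, 4, 1, 5, 9, 2, 6]   # arbitrary 1-D stand-in for a feature map
y = subsample(x)

# Shifting the input by the stride shifts the output by exactly one sample.
even_ok = subsample(shift(x, 2)) == shift(y, 1)

# Shifting the input by a single sample gives an output that matches
# no circular shift of the original output: equivariance is lost.
odd_ok = any(subsample(shift(x, 1)) == shift(y, j) for j in range(len(y)))

print(even_ok, odd_ok)  # True False
```

In this toy, the layer is equivariant only to shifts that are multiples of the stride; for every other shift the output is built from a different set of samples.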
Abstract
Equivariance guarantees that a model's predictions capture key symmetries in data. When an image is translated or rotated, an equivariant model's representation of that image will translate or rotate accordingly. The success of convolutional neural networks has historically been tied to translation equivariance directly encoded in their architecture. The rising success of vision transformers, which have no explicit architectural bias towards equivariance, challenges this narrative and suggests that augmentations and training data might also play a significant role in their performance. In order to better understand the role of equivariance in recent vision models, we introduce the Lie derivative, a method for measuring equivariance with strong mathematical foundations and minimal hyperparameters. Using the Lie derivative, we study the equivariance properties of hundreds of pretrained…
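To make the idea of a Lie derivative concrete, here is a minimal sketch for the simplest case: a scalar output that should be invariant to translation. In that case the quantity reduces to the rate of change of the output as the input is translated by an infinitesimal amount, and a value of zero means the function is invariant. The sketch makes several assumptions that are not from the source: a 1-D periodic signal stands in for an image, a central finite difference stands in for automatic differentiation, and all names (`sample`, `lie_derivative`, `mean_energy`, `first_pixel`) are hypothetical. The paper's exact definition and normalization may differ.

```python
import math

N = 16  # number of samples on a periodic grid (illustrative choice)

def signal(u):
    """Band-limited periodic test signal on [0, 1)."""
    return math.sin(2 * math.pi * u) + 0.5 * math.cos(4 * math.pi * u)

def sample(sig, t=0.0):
    """Sample `sig` after translating it by t grid steps (t may be fractional)."""
    return [sig((i - t) / N) for i in range(N)]

def lie_derivative(f, sig, eps=1e-4):
    """Central-difference estimate of d/dt f(translate_t(x)) at t = 0."""
    plus = f(sample(sig, +eps))
    minus = f(sample(sig, -eps))
    return (plus - minus) / (2 * eps)

def mean_energy(x):
    """Average squared value: unchanged by translating this band-limited signal."""
    return sum(v * v for v in x) / len(x)

def first_pixel(x):
    """Reads one fixed location, so it depends on where the signal sits."""
    return x[0]

print(abs(lie_derivative(mean_energy, signal)) < 1e-6)  # invariant: derivative near 0
print(lie_derivative(first_pixel, signal))              # not invariant: clearly nonzero
```

For `first_pixel` the estimate should be close to -2π/N, which is the analytic derivative of `signal(-t / N)` at t = 0. For a real network one would typically replace the finite difference with a Jacobian-vector product from an autodiff library.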
Taxonomy
Topics: Neural Networks and Applications · Machine Learning in Materials Science · Model Reduction and Neural Networks
