Training Dynamics of Learning 3D-Rotational Equivariance
Max W. Shen, Ewa Nowara, Michael Maser, Kyunghyun Cho

TL;DR
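This paper studies how quickly symmetry-agnostic models trained with data augmentation learn 3D-rotation equivariance. It shows that they do so rapidly and that learning equivariance is easier than the main prediction task, which bears on the efficiency trade-off between equivariant and non-equivariant models.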
Contribution
It introduces a measure of equivariance error, empirically analyzes learning dynamics of 3D-rotation equivariance, and compares efficiency of equivariant versus non-equivariant models.
Findings
Models reduce equivariance error to ≤2% of held-out loss within 1k-10k training steps.
Learning 3D-rotation equivariance is easier and better-conditioned than the main task.
Non-equivariant models can achieve lower test loss per GPU-hour than equivariant models unless the efficiency gap of equivariant models is addressed.
Abstract
While data augmentation is widely used to train symmetry-agnostic models, it remains unclear how quickly and effectively they learn to respect symmetries. We investigate this by deriving a principled measure of equivariance error that, for convex losses, calculates the percent of total loss attributable to imperfections in learned symmetry. We focus our empirical investigation on 3D-rotation equivariance in high-dimensional molecular tasks (flow matching, force field prediction, denoising voxels) and find that models quickly reduce equivariance error to ≤2% of held-out loss within 1k-10k training steps, a result robust to model and dataset size. This happens because learning 3D-rotational equivariance is an easier learning task, with a smoother and better-conditioned loss landscape, than the main prediction task. For 3D rotations, the loss penalty for non-equivariant models is small…
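The abstract does not spell out the equivariance-error measure, so the following is only a sketch of one way such a quantity could be computed, not the authors' definition. It assumes a squared-error loss (which is convex) and compares a model's loss over random rotations with the loss of its rotation-averaged ("twirled") prediction; by Jensen's inequality the averaged prediction can never have higher loss, and the gap, as a fraction of total loss, is treated here as the equivariance error. The names `random_rotation`, `equivariance_error`, the toy weight matrix `W`, and the target `y = -x` are all hypothetical.

```python
import numpy as np

def random_rotation(rng):
    """Sample a 3D rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs so the sample is uniform on O(3)
    if np.linalg.det(q) < 0:         # flip one column to land in SO(3)
        q[:, 0] = -q[:, 0]
    return q

def equivariance_error(model, x, y, rotations):
    """Return (total loss, twirled loss, fraction of loss due to non-equivariance).

    x, y: arrays of shape (N, 3) holding row vectors; y is the target for the
    unrotated input. For each rotation R the input is rotated, the model is
    applied, and the output is rotated back, so a perfectly equivariant model
    gives the same back-rotated prediction for every R.
    """
    preds = np.stack([model(x @ R.T) @ R for R in rotations])    # (K, N, 3)
    total_loss = np.mean(np.sum((preds - y) ** 2, axis=-1))      # averaged over rotations
    twirled = preds.mean(axis=0)                                 # rotation-averaged prediction
    twirled_loss = np.mean(np.sum((twirled - y) ** 2, axis=-1))
    return total_loss, twirled_loss, (total_loss - twirled_loss) / total_loss

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 3))         # toy "atom positions"
y = -x                               # toy equivariant target (assumption)

# Hypothetical non-equivariant linear model: a scaled identity plus a small
# direction-dependent perturbation that breaks rotational symmetry.
W = -0.9 * np.eye(3) + 0.05 * np.array([[0.0, 1.0, 0.0],
                                        [0.0, 0.0, 2.0],
                                        [3.0, 0.0, 0.0]])
rotations = [random_rotation(rng) for _ in range(256)]

total, twirled, frac = equivariance_error(lambda z: z @ W.T, x, y, rotations)
print(f"total loss {total:.4f}, twirled loss {twirled:.4f}, "
      f"equivariance error {100 * frac:.1f}% of total loss")
```

With a squared-error loss the gap between the two losses equals the variance of the back-rotated predictions across rotations, so the fraction is zero for an exactly equivariant model and grows as the model's output depends more strongly on input orientation.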
Taxonomy
Topics: Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
