Keep the Momentum: Conservation Laws beyond Euclidean Gradient Flows
Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré

TL;DR
This paper studies conservation laws for neural network training under non-Euclidean geometries and momentum-based dynamics. Compared with Euclidean gradient flows, the conservation laws become time-dependent and many of them are lost, notably for ReLU networks and in certain non-Euclidean settings.
Contribution
It characterizes all conservation laws in general non-Euclidean and momentum dynamics, highlighting differences from gradient flows and identifying conditions where conservation laws are lost.
Findings
Conservation laws in momentum dynamics depend on time, unlike in gradient flows.
For linear networks, momentum dynamics have fewer conservation laws than gradient flows, except in sufficiently over-parameterized regimes.
Under momentum dynamics, no conservation law remains for ReLU networks, and a similar loss occurs in certain non-Euclidean metrics.
Abstract
Conservation laws are well-established in the context of Euclidean gradient flow dynamics, notably for linear or ReLU neural network training. Yet, their existence and principles for non-Euclidean geometries and momentum-based dynamics remain largely unknown. In this paper, we characterize "all" conservation laws in this general setting. In stark contrast to the case of gradient flows, we prove that the conservation laws for momentum-based dynamics exhibit temporal dependence. Additionally, we often observe a "conservation loss" when transitioning from gradient flow to momentum dynamics. Specifically, for linear networks, our framework allows us to identify all momentum conservation laws, which are less numerous than in the gradient flow case except in sufficiently over-parameterized regimes. With ReLU networks, no conservation law remains. This phenomenon also manifests in…
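The "conservation loss" described above can be seen numerically on a toy problem. The sketch below is not taken from the paper. It assumes a scalar two-layer linear network with loss L(u, v) = 0.5 (uv - 1)^2, for which u^2 - v^2 is the standard quantity conserved by Euclidean gradient flow. It also assumes a heavy-ball form of momentum, u'' + gamma u' = -grad L, with a hypothetical friction gamma = 1. The function names, step size, and initial point are illustrative choices.

```python
import numpy as np

def grad(u, v, target=1.0):
    # Gradient of L(u, v) = 0.5 * (u*v - target)^2 (toy loss, an assumption).
    r = u * v - target
    return r * v, r * u

def simulate(momentum, steps=20000, dt=1e-3, gamma=1.0, u0=2.0, v0=0.1):
    """Return h(t) = u^2 - v^2 along a discretized trajectory.

    momentum=False: explicit Euler on gradient flow.
    momentum=True:  semi-implicit Euler on the assumed heavy-ball ODE
                    u'' + gamma * u' = -grad L, starting from zero velocity.
    """
    u, v = u0, v0
    pu, pv = 0.0, 0.0  # velocities, only used with momentum
    h = np.empty(steps + 1)
    h[0] = u * u - v * v
    for k in range(steps):
        gu, gv = grad(u, v)
        if momentum:
            pu += dt * (-gamma * pu - gu)
            pv += dt * (-gamma * pv - gv)
            u += dt * pu
            v += dt * pv
        else:
            u -= dt * gu
            v -= dt * gv
        h[k + 1] = u * u - v * v
    return h

h_gf = simulate(momentum=False)
h_mom = simulate(momentum=True)

# Largest deviation of u^2 - v^2 from its initial value along each run.
drift_gf = np.max(np.abs(h_gf - h_gf[0]))
drift_mom = np.max(np.abs(h_mom - h_mom[0]))
print("gradient flow drift:", drift_gf)
print("heavy-ball drift:   ", drift_mom)
```

Under gradient flow the drift should stay at the level of the discretization error, while under the assumed heavy-ball dynamics u^2 - v^2 visibly changes over time. This only illustrates that a gradient-flow invariant need not survive momentum. It does not reproduce the paper's characterization of the time-dependent laws.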
Taxonomy
Topics: Advanced Neuroimaging Techniques and Applications · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques
