Abide by the Law and Follow the Flow: Conservation Laws for Gradient Flows
Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré

TL;DR
This paper formalizes conservation laws for gradient flows: quantities that stay constant throughout training of a given model, for any training data and any loss. It provides methods to identify and count these laws, which help explain the implicit bias of over-parameterized models.
Contribution
It rigorously defines conservation laws for gradient flows, develops algorithms to find them, and demonstrates their application to neural network architectures.
Findings
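Conservation laws are quantities preserved along the gradient flow of a given model, for any training data and any loss, which makes them central to understanding gradient flow dynamics.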
Algorithms successfully recover known laws in ReLU networks.
The algorithms find no additional independent laws beyond the known ones.
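As an illustration of what a "known law" for ReLU networks looks like, the block below works through the standard balancedness identity for a bias-free two-layer ReLU network with scalar output. The notation (u_j for the input weights of hidden neuron j, v_j for its output weight, \ell for a differentiable loss) is an assumption made here for the example and is not taken from the summary above.

```latex
% Setting (illustrative notation): k hidden neurons, parameters
% \theta = (u_1, \dots, u_k, v_1, \dots, v_k).
f_\theta(x) = \sum_{j=1}^{k} v_j \, \mathrm{ReLU}(\langle u_j, x \rangle),
\qquad
L(\theta) = \sum_i \ell\big(f_\theta(x_i), y_i\big),
\qquad
\dot\theta(t) = -\nabla L(\theta(t)).

% With \ell'_i = \partial_1 \ell(f_\theta(x_i), y_i) and using
% \mathrm{ReLU}'(z)\, z = \mathrm{ReLU}(z):
\langle u_j, \nabla_{u_j} L \rangle
  = \sum_i \ell'_i \, v_j \, \mathrm{ReLU}(\langle u_j, x_i \rangle)
  = v_j \, \partial_{v_j} L .

% Hence, along the gradient flow, for every neuron j:
\frac{d}{dt}\Big( \|u_j\|^2 - v_j^2 \Big)
  = -2 \langle u_j, \nabla_{u_j} L \rangle + 2\, v_j \, \partial_{v_j} L
  = 0 .
```

The data and the loss enter only through the factors \ell'_i, which cancel, so the quantity \|u_j\|^2 - v_j^2 is conserved whatever the training set and loss are.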
Abstract
Understanding the geometric properties of gradient descent dynamics is a key ingredient in deciphering the recent success of very large machine learning models. A striking observation is that trained over-parameterized models retain some properties of the optimization initialization. This "implicit bias" is believed to be responsible for some favorable properties of the trained models and could explain their good generalization properties. The purpose of this article is threefold. First, we rigorously expose the definition and basic properties of "conservation laws", that define quantities conserved during gradient flows of a given model (e.g. of a ReLU network with a given architecture) with any training data and any loss. Then we explain how to find the maximal number of independent conservation laws by performing finite-dimensional algebraic manipulations on the Lie algebra generated…
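The idea of a quantity conserved during training can be checked numerically. The sketch below is a minimal illustration, not the paper's algorithm. It assumes a bias-free two-layer ReLU network with scalar output, squared loss, and random data; the function name `run_gradient_descent` and all sizes are hypothetical choices made for this example. For each hidden neuron it tracks the squared norm of the input weights minus the squared output weight, a classical conserved quantity for this architecture. Explicit gradient descent with a small step only approximates the gradient flow, so the quantity drifts slightly (at second order in the step size) instead of staying exactly constant.

```python
import numpy as np


def run_gradient_descent(lr=1e-3, steps=1000, seed=0):
    """Train a tiny two-layer ReLU net with small-step gradient descent and
    report how far ||u_j||^2 - v_j^2 moves for each hidden neuron."""
    rng = np.random.default_rng(seed)
    n, d, k = 20, 3, 4                  # samples, input dim, hidden neurons (arbitrary)
    X = rng.normal(size=(n, d))         # random inputs (assumption: synthetic data)
    y = rng.normal(size=n)              # random targets
    U = 0.5 * rng.normal(size=(k, d))   # input weights, one row per hidden neuron
    v = 0.5 * rng.normal(size=k)        # output weights

    def loss_and_grads(U, v):
        pre = X @ U.T                   # pre-activations, shape (n, k)
        act = np.maximum(pre, 0.0)      # ReLU
        resid = act @ v - y             # prediction errors, shape (n,)
        loss = 0.5 * np.mean(resid ** 2)
        grad_v = act.T @ resid / n
        # dL/dU[j] = mean_i resid_i * v_j * 1[pre_ij > 0] * x_i
        grad_U = (resid[:, None] * (pre > 0) * v[None, :]).T @ X / n
        return loss, grad_U, grad_v

    def conserved(U, v):
        # One value per hidden neuron: ||u_j||^2 - v_j^2
        return np.sum(U ** 2, axis=1) - v ** 2

    q0 = conserved(U, v)
    initial_loss, _, _ = loss_and_grads(U, v)
    for _ in range(steps):
        _, grad_U, grad_v = loss_and_grads(U, v)
        U = U - lr * grad_U
        v = v - lr * grad_v
    final_loss, _, _ = loss_and_grads(U, v)

    drift = float(np.max(np.abs(conserved(U, v) - q0)))
    return {
        "drift": drift,
        "initial_loss": float(initial_loss),
        "final_loss": float(final_loss),
    }


result = run_gradient_descent()
print(result)
```

The loss should decrease while the reported drift stays small. Reducing `lr` with the total training time `lr * steps` held fixed should shrink the drift further, consistent with exact conservation in the continuous-time limit.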
Taxonomy
Topics: Model Reduction and Neural Networks · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques
