Regularization implies balancedness in the deep linear network

Kathryn Lindsey; Govind Menon

arXiv:2511.01137 · cs.LG · March 24, 2026

TL;DR

This paper uses geometric invariant theory to show that the $L^{2}$ regularizer of a deep linear network is minimized on the balanced manifold, proves convergence of the associated balancing flows, and decomposes the training dynamics into a regularizing flow on fibers and a learning flow on the balanced manifold.

Contribution

It introduces a geometric framework linking regularization and balancedness in deep linear networks, with explicit convergence results and a decomposition of training dynamics.

Findings

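01. The $L^{2}$ regularizer is minimized on the balanced manifold.

02. The balancing flows converge: the flow defined by the $L^{2}$ regularizer at a uniform exponential rate, and the flow defined by the squared moment map globally.

03. The framework gives a common mathematical setting for balancedness in deep learning and linear systems theory.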

Abstract

We use geometric invariant theory (GIT) to study the deep linear network (DLN). The Kempf-Ness theorem is used to establish that the $L^{2}$ regularizer is minimized on the balanced manifold. We introduce related balancing flows using the Riemannian geometry of fibers. The balancing flow defined by the $L^{2}$ regularizer is shown to converge to the balanced manifold at a uniform exponential rate. The balancing flow defined by the squared moment map is computed explicitly and shown to converge globally. This framework allows us to decompose the training dynamics into two distinct gradient flows: a regularizing flow on fibers and a learning flow on the balanced manifold. It also provides a common mathematical framework for balancedness in deep learning and linear systems theory. We use this framework to interpret balancedness in terms of fast-slow systems, model reduction and Bayesian…
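The first two claims of the abstract can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions, not the construction from the paper: it takes depth two, calls a pair $(W_1, W_2)$ balanced when $W_2^{T} W_2 = W_1 W_1^{T}$, and moves along the fiber $\{W_2 W_1 = W\}$ by the group action $(W_1, W_2) \mapsto (A W_1, W_2 A^{-1})$ with $A = \exp(hG)$, where $G = W_2^{T} W_2 - W_1 W_1^{T}$ is the imbalance. This is a simple descent scheme for the regularizer on a fiber and may differ from the Riemannian balancing flows the authors define. The function names, step size, tolerance, and test matrices are all hypothetical.

```python
import numpy as np

def imbalance(W1, W2):
    """G = W2^T W2 - W1 W1^T; zero exactly when the pair (W1, W2) is balanced."""
    return W2.T @ W2 - W1 @ W1.T

def regularizer(W1, W2):
    """L2 regularizer: sum of squared Frobenius norms of the two factors."""
    return np.sum(W1**2) + np.sum(W2**2)

def move_on_fiber(W1, W2, X, h):
    """Apply A = exp(h X), X symmetric: (W1, W2) -> (A W1, W2 A^{-1}).

    The product W2 W1 is unchanged, so the pair stays on its fiber.
    """
    lam, V = np.linalg.eigh(X)
    A = (V * np.exp(h * lam)) @ V.T        # V diag(exp(h lam)) V^T
    A_inv = (V * np.exp(-h * lam)) @ V.T   # V diag(exp(-h lam)) V^T
    return A @ W1, W2 @ A_inv

def balance(W1, W2, h0=0.05, max_steps=5000, tol=1e-5):
    """Descend the regularizer along the fiber until the imbalance is below tol."""
    for _ in range(max_steps):
        G = imbalance(W1, W2)
        if np.linalg.norm(G) < tol:
            break
        R, h = regularizer(W1, W2), h0
        for _ in range(40):                # backtracking: halve h until R decreases
            V1, V2 = move_on_fiber(W1, W2, G, h)
            if regularizer(V1, V2) < R:
                break
            h *= 0.5
        else:
            break                          # no decreasing step found; stop
        W1, W2 = V1, V2
    return W1, W2

# Hypothetical 2x2 example with an invertible end-to-end matrix W = W2 W1.
W1 = np.array([[2.0, 1.0], [0.0, 1.0]])
W2 = np.array([[0.5, 0.0], [1.0, 3.0]])
W1b, W2b = balance(W1, W2)

print("imbalance before:", np.linalg.norm(imbalance(W1, W2)))
print("imbalance after: ", np.linalg.norm(imbalance(W1b, W2b)))
print("regularizer before/after:", regularizer(W1, W2), regularizer(W1b, W2b))
print("2 * nuclear norm of W:   ", 2 * np.linalg.norm(W2 @ W1, "nuc"))
```

Using the exact group action rather than an Euler step on the factors keeps $W_2 W_1$ fixed up to rounding, so any drop in the regularizer comes purely from rebalancing. For depth two the minimum of the regularizer over a fiber is twice the nuclear norm of $W$, which is why the last printed line is a useful reference value.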

Taxonomy

Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference