The geometry of the deep linear network

Govind Menon

arXiv:2411.09004 · cs.NE · November 15, 2024

Open Access

TL;DR

This expository paper analyzes the training dynamics of deep linear networks (DLNs) using the geometric theory of dynamical systems, unifies rigorous results by several authors into a thermodynamic framework, and relates the DLN to other areas of mathematics.

Contribution

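It unifies existing rigorous results into a geometric and thermodynamic account of deep linear networks: a characterization of invariant manifolds and Riemannian geometry, exact formulas for a Boltzmann entropy, and stochastic gradient descent of free energy formulated as a Riemannian Langevin equation.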

Findings

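01. Characterization of the invariant manifolds and Riemannian geometry of the DLN

02. Exact formulas for a Boltzmann entropy, and stochastic gradient descent of free energy via a Riemannian Langevin equation

03. Links between the DLN and other areas of mathematics, along with open questions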

Abstract

This article provides an expository account of training dynamics in the Deep Linear Network (DLN) from the perspective of the geometric theory of dynamical systems. Rigorous results by several authors are unified into a thermodynamic framework for deep learning. The analysis begins with a characterization of the invariant manifolds and Riemannian geometry in the DLN. This is followed by exact formulas for a Boltzmann entropy, as well as stochastic gradient descent of free energy using a Riemannian Langevin Equation. Several links between the DLN and other areas of mathematics are discussed, along with some open questions.
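
One standard example of an invariant structure in the DLN can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the usual parametrization W = W_N ... W_1 with square factors, a quadratic loss against a randomly drawn target matrix, and plain gradient descent with a small step. Under continuous-time gradient flow the differences W_{p+1}^T W_{p+1} - W_p W_p^T are conserved (a well-known property of deep linear networks); discrete gradient descent preserves them only approximately. All function names and parameter values are hypothetical.

```python
import numpy as np

def chain_product(factors, d):
    """Return W_k ... W_1 for factors = [W_1, ..., W_k]; identity if the list is empty."""
    W = np.eye(d)
    for Wp in factors:
        W = Wp @ W
    return W

def loss_and_grads(Ws, target):
    """Loss 0.5 * ||W_N ... W_1 - target||_F^2 and its gradient for each factor."""
    d = target.shape[0]
    residual = chain_product(Ws, d) - target
    loss = 0.5 * np.sum(residual ** 2)
    grads = []
    for p in range(len(Ws)):
        above = chain_product(Ws[p + 1:], d)  # product of the factors after W_p
        below = chain_product(Ws[:p], d)      # product of the factors before W_p
        grads.append(above.T @ residual @ below.T)
    return loss, grads

def balance_defects(Ws):
    """Differences W_{p+1}^T W_{p+1} - W_p W_p^T between adjacent factors."""
    return [Ws[p + 1].T @ Ws[p + 1] - Ws[p] @ Ws[p].T for p in range(len(Ws) - 1)]

def run_demo(depth=3, d=3, eta=1e-3, steps=2000, seed=0):
    """Run gradient descent and report initial loss, final loss, and defect drift."""
    rng = np.random.default_rng(seed)
    target = rng.standard_normal((d, d))  # assumed target, for illustration only
    Ws = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(depth)]
    loss0, _ = loss_and_grads(Ws, target)
    defects0 = balance_defects(Ws)
    for _ in range(steps):
        _, grads = loss_and_grads(Ws, target)
        Ws = [W - eta * g for W, g in zip(Ws, grads)]
    loss1, _ = loss_and_grads(Ws, target)
    # Largest Frobenius-norm change of any defect over the whole run.
    drift = max(np.linalg.norm(D1 - D0)
                for D0, D1 in zip(defects0, balance_defects(Ws)))
    return loss0, loss1, drift

if __name__ == "__main__":
    loss0, loss1, drift = run_demo()
    print(f"loss: {loss0:.4f} -> {loss1:.4f}, defect drift: {drift:.2e}")
```

The first-order terms in the defect update cancel exactly, so the drift per step is of order `eta**2`. Reducing `eta` while increasing `steps` proportionally should therefore shrink the reported drift, consistent with exact conservation in the continuous-time limit.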

Taxonomy

Topics: Face and Expression Recognition