Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning

Nadav Cohen; Noam Razin

arXiv:2408.13767·cs.LG·November 7, 2024

TL;DR

These lecture notes present a mathematical theory of linear neural networks, focusing on optimization and generalization. The theory relies on tools that are dynamical in nature, and the notes also discuss practical applications derived from it.

Contribution

The notes present a theory of linear neural networks, developed by the authors and collaborators and built on dynamical tools, that advances the theoretical understanding of optimization and generalization in these models.

Findings

01

The theory provides insights into the training dynamics of linear neural networks.

02

Practical applications derived from the theory demonstrate its relevance.

03

The approach enhances the mathematical understanding of deep learning models.

Abstract

These notes are based on a lecture delivered by NC in March 2021, as part of an advanced course at Princeton University on the mathematical understanding of deep learning. They present a theory (developed by NC, NR and collaborators) of linear neural networks, a fundamental model in the study of optimization and generalization in deep learning. Practical applications born from the presented theory are also discussed. The theory is based on mathematical tools that are dynamical in nature. It showcases the potential of such tools to push the envelope of our understanding of optimization and generalization in deep learning. The text assumes familiarity with the basics of statistical learning theory. Exercises (without solutions) are included.

Taxonomy

Topics: Neural Networks and Applications