Lecture Notes on Linear Neural Networks: A Tale of Optimization and Generalization in Deep Learning
Nadav Cohen, Noam Razin

TL;DR
These lecture notes present a mathematical theory of linear neural networks, focusing on optimization and generalization. The theory relies on tools that are dynamical in nature, and the notes also discuss practical applications derived from it.
Contribution
The notes present a theory, built on dynamical mathematical tools, for analyzing optimization and generalization in linear neural networks.
Findings
The theory provides insights into the training dynamics of linear neural networks.
Practical applications derived from the theory demonstrate its relevance.
The approach enhances the mathematical understanding of deep learning models.
Abstract
These notes are based on a lecture delivered by NC in March 2021, as part of an advanced course at Princeton University on the mathematical understanding of deep learning. They present a theory (developed by NC, NR and collaborators) of linear neural networks -- a fundamental model in the study of optimization and generalization in deep learning. Practical applications born from the presented theory are also discussed. The theory is based on mathematical tools that are dynamical in nature. It showcases the potential of such tools to push the envelope of our understanding of optimization and generalization in deep learning. The text assumes familiarity with the basics of statistical learning theory. Exercises (without solutions) are included.
Taxonomy
Topics: Neural Networks and Applications
