Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective
Guan-Horng Liu, Evangelos A. Theodorou

TL;DR
This paper reviews how viewing deep neural networks as dynamical systems and optimization as control problems offers a unified theoretical framework for understanding deep learning's principles, convergence, and generalization.
Contribution
It aligns existing branches of deep learning theory through the lens of dynamical systems and optimal control, offering insight into information propagation across layers, training dynamics, and hyper-parameter tuning.
Findings
Deep neural networks are modeled as discrete-time nonlinear dynamical systems, with layers playing the role of time steps.
Optimization algorithms are recast as controllers, so that training becomes an optimal control problem.
The framework applies to learning paradigms beyond supervised learning.
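The first finding can be made concrete with a small sketch. The code below is illustrative only and is not taken from the paper: it assumes a fully connected layer with a tanh nonlinearity as the transition map f, and the names `layer_step`, `rollout`, and `thetas` are hypothetical. It treats the layer index t as time and propagates a state x_t through the update x_{t+1} = f(x_t, θ_t), where θ_t = (W, b) are the weights of layer t.

```python
import math

def layer_step(x, W, b):
    """One time step x_{t+1} = f(x_t, theta_t), with theta_t = (W, b).

    Assumption: f is an affine map followed by tanh (chosen for illustration).
    """
    return [
        math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]

def rollout(x0, thetas):
    """Propagate the initial state through every layer and return the trajectory."""
    trajectory = [x0]
    for W, b in thetas:
        trajectory.append(layer_step(trajectory[-1], W, b))
    return trajectory

# Two hypothetical layers acting on a 2-dimensional state.
thetas = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]),
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
]
trajectory = rollout([1.0, -1.0], thetas)
```

Here `trajectory[0]` is the network input and `trajectory[-1]` is its output. In the control reading, the per-layer parameters in `thetas` act as the control inputs that an optimizer adjusts.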
Abstract
Attempts from different disciplines to provide a fundamental understanding of deep learning have advanced rapidly in recent years, yet a unified framework remains relatively limited. In this article, we provide one possible way to align existing branches of deep learning theory through the lens of dynamical systems and optimal control. By viewing deep neural networks as discrete-time nonlinear dynamical systems, we can analyze how information propagates through layers using mean field theory. When optimization algorithms are further recast as controllers, the ultimate goal of training processes can be formulated as an optimal control problem. In addition, we can reveal convergence and generalization properties by studying the stochastic dynamics of optimization algorithms. This viewpoint features a wide range of theoretical studies, from information bottleneck to statistical physics. It…
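One common way to write the optimal control problem the abstract refers to is sketched below. The notation is an assumption made for illustration and is not quoted from the paper: x_t is the activation at layer t, θ_t the parameters of that layer (the control), f the layer map, L a per-layer running cost such as a regularizer, and Φ the terminal cost given by the training loss at the output layer T.

```latex
\min_{\theta_0, \dots, \theta_{T-1}} \;
  \Phi(x_T) + \sum_{t=0}^{T-1} L(x_t, \theta_t)
\quad \text{subject to} \quad
  x_{t+1} = f(x_t, \theta_t), \qquad x_0 \text{ given}.
```

Under this reading, the constraint is the forward pass of the network and the minimization over the θ_t corresponds to training.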
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
