A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Namjoon Suh, Guang Cheng

TL;DR
This survey comprehensively reviews the statistical theories of neural networks, covering approximation capabilities, training dynamics, and generative models, highlighting recent advances and theoretical insights in deep learning.
Contribution
It provides an integrated overview of the latest theoretical developments in neural network approximation, training dynamics, and generative models, connecting diverse perspectives and recent progress.
Findings
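Explicitly constructed neural networks achieve fast convergence rates of excess risk in nonparametric regression, although the underlying analysis applies only to the global minimizer.
Training dynamics under gradient-based methods are explained through the Neural Tangent Kernel (NTK) and mean-field paradigms.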
Recent advances include theoretical understanding of GANs, diffusion models, and LLMs.
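To make the NTK finding concrete, here is a minimal sketch, not taken from the survey, of the empirical neural tangent kernel for a two-layer ReLU network f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x). It assumes the output weights a_r are fixed at +/-1 and only the hidden weights are trained; the function name `empirical_ntk` and all sizes are hypothetical choices for illustration.

```python
import numpy as np

def empirical_ntk(X, W):
    """Empirical NTK of f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x),
    differentiating with respect to the hidden weights W only.
    Assumes a_r = +/-1, so a_r**2 = 1 and the signs drop out of the kernel."""
    m = W.shape[0]
    # Activation pattern: act[i, r] = 1 if hidden unit r is active on input i.
    act = (X @ W.T > 0).astype(float)  # shape (n, m)
    # K[i, j] = (x_i . x_j) * (number of units active on both inputs) / m
    return (X @ X.T) * (act @ act.T) / m

rng = np.random.default_rng(0)
n, d, m = 5, 3, 2000                           # illustrative sizes (assumed)
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm inputs
W = rng.normal(size=(m, d))                    # Gaussian initialization
K = empirical_ntk(X, W)
```

Under these assumptions, as the width m grows the entries of K concentrate around (x_i . x_j) * (pi - theta_ij) / (2 * pi), where theta_ij is the angle between x_i and x_j. The kernel is therefore nearly fixed at initialization, which is the regime in which NTK-style analyses treat gradient-based training as kernel regression.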
Abstract
In this article, we review the literature on statistical theories of neural networks from three perspectives: approximation, training dynamics and generative models. In the first part, results on excess risks for neural networks are reviewed in the nonparametric framework of regression (and classification in Appendix B). These results rely on explicit constructions of neural networks, leading to fast convergence rates of excess risks. Nonetheless, their underlying analysis only applies to the global minimizer in the highly non-convex landscape of deep neural networks. This motivates us to review the training dynamics of neural networks in the second part. Specifically, we review papers that attempt to answer "how the neural network trained via gradient-based methods finds the solution that can generalize well on unseen data." In particular, two well-known paradigms are…
Taxonomy
Topics: Computational Physics and Python Applications · Neural Networks and Applications
Methods: Diffusion
