Recent advances in deep learning theory
Fengxiang He, Dacheng Tao

TL;DR
This paper reviews recent theoretical advances in deep learning and groups the literature into six areas: generalization, optimization, geometry, over-parameterization, architecture, and ethics.
Contribution
It provides a comprehensive organization and synthesis of recent deep learning theory literature across six distinct thematic categories.
Findings
Analysis of generalization through complexity and capacity measures
Modeling stochastic gradient descent with differential equations
Insights into the geometry of loss landscapes and over-parameterization effects
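The second finding, modeling stochastic gradient descent with differential equations, can be illustrated with a toy example. The sketch below is not taken from the paper. It assumes a one-dimensional quadratic loss L(theta) = a * theta^2 / 2, additive Gaussian gradient noise with constant standard deviation sigma, and the commonly used continuous-time approximation d(theta) = -a * theta dt + sqrt(eta) * sigma dW, where eta is the learning rate. All function and parameter names are hypothetical. Under these assumptions the SDE is an Ornstein-Uhlenbeck process, whose stationary variance eta * sigma^2 / (2a) should roughly match the long-run variance of the SGD iterates when eta is small.

```python
import math
import random


def sgd_path(a, sigma, eta, steps, rng, theta0=1.0):
    """SGD on the assumed loss L(theta) = a * theta**2 / 2 with Gaussian gradient noise."""
    theta, path = theta0, []
    for _ in range(steps):
        noisy_grad = a * theta + sigma * rng.gauss(0.0, 1.0)
        theta -= eta * noisy_grad
        path.append(theta)
    return path


def sde_path(a, sigma, eta, total_time, dt, rng, theta0=1.0):
    """Euler-Maruyama discretization of d(theta) = -a*theta dt + sqrt(eta)*sigma dW."""
    theta, path = theta0, []
    noise_scale = math.sqrt(eta) * sigma * math.sqrt(dt)
    for _ in range(int(round(total_time / dt))):
        theta += -a * theta * dt + noise_scale * rng.gauss(0.0, 1.0)
        path.append(theta)
    return path


def tail_variance(path, burn_in_fraction=0.1):
    """Empirical variance after discarding an initial burn-in segment."""
    tail = path[int(len(path) * burn_in_fraction):]
    mean = sum(tail) / len(tail)
    return sum((x - mean) ** 2 for x in tail) / len(tail)


def ou_stationary_variance(a, sigma, eta):
    """Stationary variance of the assumed Ornstein-Uhlenbeck approximation."""
    return eta * sigma ** 2 / (2.0 * a)


if __name__ == "__main__":
    rng = random.Random(0)
    a, sigma, eta = 1.0, 1.0, 0.01
    sgd_var = tail_variance(sgd_path(a, sigma, eta, 50_000, rng))
    sde_var = tail_variance(sde_path(a, sigma, eta, 500.0, eta / 2, rng))
    print("SGD empirical variance:", sgd_var)
    print("SDE empirical variance:", sde_var)
    print("OU stationary variance:", ou_stationary_variance(a, sigma, eta))
```

In this toy setting, taking the Euler-Maruyama step dt equal to eta reproduces the SGD update exactly, since one SGD step corresponds to a time increment of eta. This is one way to see why the SDE is a plausible small-learning-rate model here. Realistic analyses would need state-dependent noise and non-quadratic losses, which this sketch does not cover.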
Abstract
Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference
