Towards a Statistical Understanding of Neural Networks: Beyond the Neural Tangent Kernel Theories
Haobo Zhang, Jianfa Lai, Yicheng Li, Qian Lin, Jun S. Liu

TL;DR
This paper examines the limitations of fixed-kernel theories such as the NTK in explaining feature learning in neural networks, and proposes a prototype model for studying the adaptive feature learning of such networks.
Contribution
It introduces an over-parameterized Gaussian sequence model as a prototype for analyzing feature learning in neural networks beyond fixed-kernel theories.
Findings
Highlights limitations of NTK in capturing feature learning.
Proposes an over-parameterized Gaussian sequence model as a prototype for analyzing neural networks.
Provides insights into the generalization benefits of feature learning.
Abstract
A primary advantage of neural networks lies in their feature learning characteristics, which are challenging to analyze theoretically due to the complexity of their training dynamics. We propose a new paradigm for studying feature learning and the resulting benefits in generalizability. After reviewing the neural tangent kernel (NTK) theory and recent results in kernel regression, which address the generalization issue of sufficiently wide neural networks, we examine limitations and implications of the fixed kernel theory (such as the NTK theory) and review recent theoretical advancements in feature learning. Moving beyond the fixed kernel/feature theory, we consider neural networks as adaptive feature models. Finally, we propose an over-parameterized Gaussian sequence model as a prototype model to study the feature learning characteristics of neural networks.
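To make the "fixed kernel" idea in the abstract concrete, here is a minimal sketch of the empirical NTK of a small two-layer ReLU network evaluated at initialization, followed by kernel ridge regression with that kernel held fixed. The architecture, the scaling by 1/sqrt(m), the toy data, and the names `empirical_ntk` and `ntk_ridge_predict` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def empirical_ntk(X1, X2, W, a):
    """Empirical NTK of f(x) = a^T relu(W x) / sqrt(m) between rows of X1 and X2.

    The kernel is the inner product of parameter gradients of f at the two inputs.
    This toy network is an assumption made for illustration only.
    """
    m = W.shape[0]
    pre1, pre2 = X1 @ W.T, X2 @ W.T                      # pre-activations
    act1, act2 = np.maximum(pre1, 0.0), np.maximum(pre2, 0.0)
    gate1 = (pre1 > 0).astype(float)                     # relu derivative
    gate2 = (pre2 > 0).astype(float)
    k_a = act1 @ act2.T / m                              # gradients w.r.t. output weights a
    k_w = ((gate1 * a**2) @ gate2.T) * (X1 @ X2.T) / m   # gradients w.r.t. hidden weights W
    return k_a + k_w

def ntk_ridge_predict(K_train, K_new_train, y, ridge=1e-3):
    """Kernel ridge regression with a kernel that never changes after initialization."""
    alpha = np.linalg.solve(K_train + ridge * np.eye(len(y)), y)
    return K_new_train @ alpha

# Toy data and random initialization (all hypothetical).
rng = np.random.default_rng(0)
n, d, m = 20, 3, 512
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0])
W = rng.normal(size=(m, d))
a = rng.choice([-1.0, 1.0], size=m)

K = empirical_ntk(X, X, W, a)
X_new = rng.normal(size=(5, d))
pred = ntk_ridge_predict(K, empirical_ntk(X_new, X, W, a), y)
```

Because `W` and `a` are never updated, the features behind `K` stay those of the random initialization. That is the sense in which a fixed-kernel theory cannot describe features that adapt to the data during training.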
Taxonomy
Topics: Neural Networks and Applications
Methods: Neural Tangent Kernel
