Towards a Statistical Understanding of Neural Networks: Beyond the Neural Tangent Kernel Theories

Haobo Zhang; Jianfa Lai; Yicheng Li; Qian Lin; Jun S. Liu

arXiv:2412.18756 · cs.LG · December 30, 2024

TL;DR

This paper examines the limitations of fixed kernel theories such as the NTK in explaining feature learning in neural networks, and proposes an over-parameterized Gaussian sequence model as a prototype for studying their adaptive feature learning.

Contribution

It introduces an over-parameterized Gaussian sequence model as a prototype for analyzing feature learning in neural networks beyond fixed kernel theories.

Findings

01. Highlights the limitations of fixed kernel theories such as the NTK in capturing feature learning.

02. Proposes an over-parameterized Gaussian sequence model as a prototype for analyzing neural networks.

03. Provides insights into the generalization benefits of feature learning.

Abstract

A primary advantage of neural networks lies in their feature learning characteristics, which are challenging to analyze theoretically due to the complexity of their training dynamics. We propose a new paradigm for studying feature learning and the resulting benefits in generalizability. After reviewing the neural tangent kernel (NTK) theory and recent results in kernel regression, which address the generalization issue of sufficiently wide neural networks, we examine limitations and implications of the fixed kernel theory (such as the NTK theory) and review recent theoretical advancements in feature learning. Moving beyond the fixed kernel/feature theory, we consider neural networks as adaptive feature models. Finally, we propose an over-parameterized Gaussian sequence model as a prototype model to study the feature learning characteristics of neural networks.

Taxonomy

Topics: Neural Networks and Applications

Methods: Neural Tangent Kernel