Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yu Bai, Jason D. Lee

TL;DR
This paper demonstrates how two-layer neural networks can escape the NTK regime to efficiently learn functions combining dense low-degree and sparse high-degree polynomials, surpassing the capabilities of NTK and QuadNTK alone.
Contribution
The paper introduces a method based on spectral analysis to identify good parameter directions, together with a regularizer that lets neural networks learn function classes beyond the reach of the NTK.
Findings
Neural networks can jointly learn dense low-degree and sparse high-degree polynomials.
The proposed regularizer guides gradient descent to low-error solutions.
The method improves sample complexity over NTK and QuadNTK approaches.
Abstract
A recent goal in the theory of deep learning is to identify how neural networks can escape the "lazy training," or Neural Tangent Kernel (NTK) regime, where the network is coupled with its first-order Taylor expansion at initialization. While the NTK is minimax optimal for learning dense polynomials (Ghorbani et al., 2021), it cannot learn features, and hence has poor sample complexity for learning many classes of functions, including sparse polynomials. Recent works have thus aimed to identify settings where gradient-based algorithms provably generalize better than the NTK. One such example is the "QuadNTK" approach of Bai and Lee (2020), which analyzes the second-order term in the Taylor expansion. Bai and Lee (2020) show that the second-order term can learn sparse polynomials efficiently; however, it sacrifices the ability to learn general dense polynomials. In this paper, we analyze…
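The first-order (NTK) and second-order (QuadNTK) terms mentioned in the abstract can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the network form f(x; W) = (1/sqrt(m)) * sum_r a_r * sigma(w_r . x), the tanh activation, the fixed second-layer weights a, and the function names. The sketch only shows how the Taylor expansion around the initialization W0 splits into a linear term and a quadratic term; it does not implement the paper's regularizer or training procedure.

```python
import numpy as np

# Assumed smooth activation (tanh) and its first two derivatives.
def sigma(z):
    return np.tanh(z)

def dsigma(z):
    return 1.0 - np.tanh(z) ** 2

def d2sigma(z):
    t = np.tanh(z)
    return -2.0 * t * (1.0 - t ** 2)

def network(x, a, W):
    """Assumed two-layer network: f(x; W) = a . sigma(W x) / sqrt(m)."""
    m = W.shape[0]
    return a @ sigma(W @ x) / np.sqrt(m)

def taylor_terms(x, a, W0, W):
    """Zeroth-, first-, and second-order Taylor terms of f(x; W0 + W) in W."""
    m = W0.shape[0]
    pre = W0 @ x      # pre-activations at initialization, shape (m,)
    delta = W @ x     # change in pre-activations caused by W, shape (m,)
    f0 = a @ sigma(pre) / np.sqrt(m)
    linear = a @ (dsigma(pre) * delta) / np.sqrt(m)                    # NTK-style term
    quadratic = a @ (d2sigma(pre) * delta ** 2) / (2.0 * np.sqrt(m))   # QuadNTK-style term
    return f0, linear, quadratic

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, m = 10, 256
    x = rng.normal(size=d) / np.sqrt(d)
    a = rng.choice([-1.0, 1.0], size=m)
    W0 = rng.normal(size=(m, d))
    W = 0.1 * rng.normal(size=(m, d))   # small move away from initialization

    full = network(x, a, W0 + W)
    f0, lin, quad = taylor_terms(x, a, W0, W)
    print("error of first-order expansion :", abs(full - (f0 + lin)))
    print("error of second-order expansion:", abs(full - (f0 + lin + quad)))
```

In this toy setting the linear term is what a lazy-training analysis tracks, while the quadratic term is the piece a QuadNTK-style analysis isolates. How the paper combines the two is described in the full text, not in this sketch.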
Taxonomy
Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Neural Networks and Applications
Methods: Test · Neural Tangent Kernel
