Benign Overfitting in Deep Neural Networks under Lazy Training
Zhenyu Zhu, Fanghui Liu, Grigorios G Chrysos, Francesco Locatello, Volkan Cevher

TL;DR
This paper shows that over-parameterized deep ReLU networks in the lazy training regime can reach (nearly) zero training error while attaining Bayes-optimal test error on well-separated data distributions, an instance of benign overfitting.
Contribution
The paper unifies three interrelated concepts (overparameterization, benign overfitting, and the Lipschitz constant of DNNs) to give theoretical insight into generalization in deep neural networks under the lazy training regime.
Findings
DNNs can interpolate data and still generalize well in well-separated distributions.
Smoother interpolating functions lead to better generalization.
Under the NTK regime, the generalization error converges to a constant order that depends only on label noise and initialization noise.
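As a rough illustration of what "lazy training" means in these findings, the sketch below trains a two-layer ReLU network with NTK-style 1/sqrt(width) scaling and measures how far the first-layer weights move from initialization. Everything here is an assumption made for illustration (the toy data, the two-layer architecture, the function name `relative_weight_change`, the step size); it is not the paper's setup, which concerns deep networks.

```python
import numpy as np

def relative_weight_change(width, steps=500, lr=0.05, seed=0):
    """Train a two-layer ReLU net (NTK scaling) with gradient descent.

    Returns (relative change of first-layer weights, final training MSE).
    Hypothetical toy setup, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = 20, 5
    # Toy data: unit-norm inputs, labels given by the sign of the first coordinate.
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    y = np.sign(X[:, 0])
    # f(x) = (1 / sqrt(m)) * sum_j a_j * relu(w_j . x); only W is trained.
    W0 = rng.standard_normal((width, d))
    a = rng.choice([-1.0, 1.0], size=width)
    W = W0.copy()
    for _ in range(steps):
        pre = X @ W.T                                   # (n, m) pre-activations
        out = np.maximum(pre, 0.0) @ a / np.sqrt(width)
        resid = out - y                                 # (n,) residuals
        # Gradient of 0.5 * sum(resid**2) with respect to W, shape (m, d).
        grad = ((resid[:, None] * (pre > 0)) * a).T @ X / np.sqrt(width)
        W -= lr * grad
    out = np.maximum(X @ W.T, 0.0) @ a / np.sqrt(width)
    mse = float(np.mean((out - y) ** 2))
    rel = float(np.linalg.norm(W - W0) / np.linalg.norm(W0))
    return rel, mse

if __name__ == "__main__":
    for m in (16, 256, 4096):
        rel, mse = relative_weight_change(m)
        print(f"width={m:5d}  relative weight change={rel:.4f}  train MSE={mse:.4f}")
```

With this scaling, the relative movement of the weights shrinks as the width grows, which is the sense in which a wide network stays close to its initialization during training.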
Abstract
This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU activation functions and proves that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification while obtaining (nearly) zero-training error under the lazy training regime. For this purpose, we unify three interrelated concepts of overparameterization, benign overfitting, and the Lipschitz constant of DNNs. Our results indicate that interpolating with smoother functions leads to better generalization. Furthermore, we investigate the special case where interpolating smooth ground-truth functions is performed by DNNs under the Neural Tangent Kernel (NTK) regime for generalization. Our result demonstrates that the generalization error converges to a constant order that only depends on label noise and initialization noise, which theoretically verifies benign…
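For readers new to the lazy regime, the standard first-order picture behind the NTK can be written as follows. The notation ($f$, $\theta_0$, $K$) is assumed here for illustration and is not necessarily the paper's own.

```latex
% Lazy training: the network stays close to its linearization at initialization \theta_0.
f(x;\theta) \approx f(x;\theta_0)
  + \big\langle \nabla_\theta f(x;\theta_0),\, \theta - \theta_0 \big\rangle ,
\qquad
% The associated Neural Tangent Kernel at initialization.
K(x,x') = \big\langle \nabla_\theta f(x;\theta_0),\, \nabla_\theta f(x';\theta_0) \big\rangle .
```

In this picture, training the network behaves like kernel regression with $K$, and $f(x;\theta_0)$ is the random function at initialization.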
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Advanced Neural Network Applications
Methods: Test · Neural Tangent Kernel
