To understand deep learning we need to understand kernel learning
Mikhail Belkin, Siyuan Ma, Soumik Mandal

TL;DR
This paper investigates why over-parameterized models like deep neural networks generalize well despite fitting noisy data, highlighting the role of kernel properties and challenging existing theoretical bounds.
Contribution
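It demonstrates experimentally that kernel machines trained to zero classification error (or near-zero regression error) still generalize well, and it gives a lower bound on the norm of zero-loss solutions for smooth kernels. That norm grows nearly exponentially with data size, which is hard to reconcile with existing theoretical bounds.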
Findings
Kernel machines trained to fit the training data exactly perform well on test data, even when the training labels are corrupted with a high level of noise.
Non-smooth Laplacian kernels fit noisy labels easily, whereas smooth Gaussian kernels require many more epochs to do so.
Generalization is influenced more by the properties of the kernel than by the optimization process.
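The first finding can be illustrated with a minimal sketch, under assumptions that are not from the paper: a hypothetical two-dimensional toy task, 10% flipped training labels, a Laplacian kernel with bandwidth 1, and a direct linear solve in place of the iterative training the paper studies. It fits the noisy training labels exactly and then measures accuracy on clean test labels.

```python
import numpy as np

def laplacian_kernel(A, B, sigma=1.0):
    """k(x, z) = exp(-||x - z|| / sigma), evaluated for all row pairs."""
    dist = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.exp(-dist / sigma)

def make_data(n, rng, noise=0.0):
    """Hypothetical toy task: the label is the sign of the first coordinate."""
    X = rng.normal(size=(n, 2))
    y = np.sign(X[:, 0])
    flip = rng.random(n) < noise  # flip a random fraction of the labels
    y[flip] *= -1
    return X, y

def run_demo(seed=0, n_train=300, n_test=1000, noise=0.1):
    rng = np.random.default_rng(seed)
    X_train, y_train = make_data(n_train, rng, noise=noise)
    X_test, y_test = make_data(n_test, rng, noise=0.0)  # clean test labels

    # Interpolating solution: solve K alpha = y with no regularization.
    K = laplacian_kernel(X_train, X_train)
    alpha = np.linalg.solve(K, y_train)

    train_pred = np.sign(K @ alpha)
    test_pred = np.sign(laplacian_kernel(X_test, X_train) @ alpha)
    train_acc = float(np.mean(train_pred == y_train))
    test_acc = float(np.mean(test_pred == y_test))
    return train_acc, test_acc

train_acc, test_acc = run_demo()
print(f"train accuracy (noisy labels): {train_acc:.3f}")
print(f"test accuracy (clean labels):  {test_acc:.3f}")
```

The direct solve is used only because it makes the zero-training-error condition explicit. It says nothing about how many epochs an iterative method would need, which is what the comparison between Laplacian and Gaussian kernels is about.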
Abstract
Generalization performance of classifiers in deep learning has recently become a subject of intense study. Deep models, typically over-parametrized, tend to fit the training data exactly. Despite this "overfitting", they perform well on test data, a phenomenon not yet fully understood. The first point of our paper is that strong performance of overfitted classifiers is not a unique feature of deep learning. Using six real-world and two synthetic datasets, we establish experimentally that kernel machines trained to have zero classification or near zero regression error perform very well on test data, even when the labels are corrupted with a high level of noise. We proceed to give a lower bound on the norm of zero loss solutions for smooth kernels, showing that they increase nearly exponentially with data size. We point out that this is difficult to reconcile with the existing…
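The quantity that the lower bound concerns can be computed directly on small problems. For the minimum-norm interpolant f(x) = Σ α_i K(x_i, x) with Kα = y, the squared RKHS norm is y^T K^{-1} y. The sketch below evaluates it for a Gaussian kernel on nested subsets of one noisy sample. The toy data, dimension, bandwidth, noise level, and subset sizes are all assumptions chosen to keep the linear solve numerically stable. The sketch does not reproduce the paper's bound or its growth rate.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2)), evaluated for all row pairs."""
    dist = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return np.exp(-dist**2 / (2.0 * sigma**2))

def interpolant_norm_sq(X, y, sigma=1.0):
    """Squared RKHS norm y^T K^{-1} y of the minimum-norm interpolant."""
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K, y)
    return float(y @ alpha)

def norm_growth(sizes=(25, 50, 100, 200), dim=5, noise=0.1, seed=0):
    """Norms on nested subsets (the first n points) of one noisy sample."""
    rng = np.random.default_rng(seed)
    n_max = max(sizes)
    X = rng.normal(size=(n_max, dim))
    y = np.sign(X[:, 0])               # hypothetical toy labels
    flip = rng.random(n_max) < noise   # flip a random fraction of the labels
    y[flip] *= -1
    return [interpolant_norm_sq(X[:n], y[:n]) for n in sizes]

for n, value in zip((25, 50, 100, 200), norm_growth()):
    print(f"n = {n:4d}   squared norm = {value:.3e}")
```

Because the subsets are nested, every added point adds an interpolation constraint, so the minimum norm cannot decrease as n grows. How quickly it grows depends on the kernel, the bandwidth, and the label noise.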
Taxonomy
Topics: Face and Expression Recognition · Gaussian Processes and Bayesian Inference · Sparse and Compressive Sensing Techniques
