Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed
Maria Refinetti, Sebastian Goldt, Florent Krzakala, Lenka Zdeborová

TL;DR
This paper demonstrates that small two-layer neural networks can outperform kernel methods on a Gaussian mixture classification task in high dimensions, highlighting the limitations of kernel methods and the advantages of neural networks.
Contribution
A theoretical analysis showing that small two-layer neural networks outperform kernel methods on a high-dimensional Gaussian mixture classification task, together with a derivation of the learning dynamics and the resulting performance.
Findings
Small neural networks beat kernel methods on Gaussian mixture classification
Over-parameterization speeds up convergence but doesn't improve final accuracy
Kernel methods fail to match neural network performance in high-dimensional settings
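To make the first finding concrete, here is a minimal sketch under stated assumptions. The summary does not specify the geometry of the mixture, so the XOR-like arrangement of four clusters, the function names, and all parameter values (`mean`, `noise`, `dim`, sample sizes) are illustrative choices rather than the paper's setup. The four-unit ReLU network is set by hand, not trained, so the sketch only illustrates that a few hidden neurons can represent a decision rule that no linear rule can. It does not reproduce the paper's analysis of learning dynamics or of kernel methods.

```python
import numpy as np

def sample_xor_mixture(n, dim, mean=2.0, noise=0.5, seed=0):
    """Draw n points from four Gaussian clusters arranged like XOR.

    Assumed setup (not from the paper): cluster centres sit at
    (+-mean, +-mean) in the first two coordinates, and the label is
    the product of the two signs.
    """
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n, 2))
    x = noise * rng.standard_normal((n, dim))
    x[:, :2] += mean * signs
    y = signs[:, 0] * signs[:, 1]
    return x, y

def fit_linear(x, y):
    """Least-squares linear classifier with a bias term."""
    x_aug = np.hstack([x, np.ones((len(x), 1))])
    w, *_ = np.linalg.lstsq(x_aug, y, rcond=None)
    return w

def predict_linear(w, x):
    x_aug = np.hstack([x, np.ones((len(x), 1))])
    return np.sign(x_aug @ w)

def predict_four_relu(x):
    """Hand-set two-layer ReLU network with four hidden units.

    Computes |x1 + x2| - |x1 - x2|, whose sign equals sign(x1 * x2).
    """
    w1 = np.zeros((4, x.shape[1]))
    w1[:, :2] = [[1, 1], [-1, -1], [1, -1], [-1, 1]]
    v = np.array([1.0, 1.0, -1.0, -1.0])
    hidden = np.maximum(x @ w1.T, 0.0)  # ReLU activations
    return np.sign(hidden @ v)

def accuracy(pred, y):
    return float(np.mean(pred == y))

x_train, y_train = sample_xor_mixture(2000, dim=50, seed=0)
x_test, y_test = sample_xor_mixture(4000, dim=50, seed=1)

w = fit_linear(x_train, y_train)
acc_linear = accuracy(predict_linear(w, x_test), y_test)
acc_relu = accuracy(predict_four_relu(x_test), y_test)
print(f"linear: {acc_linear:.3f}  four-ReLU network: {acc_relu:.3f}")
```

On this assumed mixture, any linear rule classifies at most three of the four clusters correctly, so its accuracy stays below 75%, while the hand-set network separates the clusters almost perfectly at this noise level.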
Abstract
A recent series of theoretical works showed that the dynamics of neural networks with a certain initialisation are well-captured by kernel methods. Concurrent empirical work demonstrated that kernel methods can come close to the performance of neural networks on some image classification tasks. These results raise the question of whether neural networks only learn successfully if kernels also learn successfully, despite neural networks being more expressive. Here, we show theoretically that two-layer neural networks (2LNN) with only a few hidden neurons can beat the performance of kernel learning on a simple Gaussian mixture classification task. We study the high-dimensional limit where the number of samples is linearly proportional to the input dimension, and show that while small 2LNN achieve near-optimal performance on this task, lazy training approaches such as random features and…
Taxonomy
Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference · Machine Learning and ELM
