Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression
Lechao Xiao, Hong Hu, Theodor Misiakiewicz, Yue M. Lu, Jeffrey Pennington

TL;DR
This paper derives precise formulas for the learning curves of dot-product kernel regression in higher-order scaling regimes, where the number of samples m grows polynomially with the dimension d. The formulas show that the prediction error can peak repeatedly as m increases.
Contribution
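The paper introduces exact formulas for the test error, bias, and variance of kernel ridge regression in higher-order asymptotics, extending the theory beyond the classical large-sample regime and the regime where the number of samples scales linearly with the dimension.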
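For readers who want to see the quantity being analyzed, here is a minimal numerical sketch of kernel ridge regression with a dot-product kernel. It is not the paper's method or its formulas: the kernel `exp(<x, z> / d)`, the linear target, the noise level, and every function name below are assumptions chosen only for illustration. It estimates the test error empirically at a few sample sizes.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n points uniformly from the sphere of radius sqrt(d) in R^d."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True) * np.sqrt(d)

def dot_product_kernel(X, Z, d):
    """Illustrative dot-product kernel k(x, z) = exp(<x, z> / d) (an assumption)."""
    return np.exp(X @ Z.T / d)

def krr_test_error(m, d, ridge, n_test, rng):
    """Fit kernel ridge regression on m noisy samples; return empirical test MSE."""
    X = sample_sphere(m, d, rng)
    X_test = sample_sphere(n_test, d, rng)
    beta = rng.standard_normal(d) / np.sqrt(d)      # hypothetical linear target
    y = X @ beta + 0.1 * rng.standard_normal(m)     # noisy training labels
    K = dot_product_kernel(X, X, d)
    alpha = np.linalg.solve(K + ridge * np.eye(m), y)
    pred = dot_product_kernel(X_test, X, d) @ alpha
    return float(np.mean((pred - X_test @ beta) ** 2))

rng = np.random.default_rng(0)
errors = [krr_test_error(m, d=20, ridge=1e-3, n_test=500, rng=rng)
          for m in (10, 20, 40, 80)]
print(errors)
```

A single random draw at small d is noisy, so this sketch should not be expected to reproduce the asymptotic curves; averaging over many draws and sweeping m more finely would be the natural next step.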
Findings
Learning curves exhibit peaks at specific data scales, notably when m ≈ d^r/r!.
Multiple descent phenomena occur at various scales in the higher-order asymptotics.
The results unify and extend existing asymptotic theories for kernel regression.
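To make the peak condition m ≈ d^r/r! concrete, the following short sketch evaluates d^r/r! for the first few orders r at a fixed dimension. The function name and the choice d = 100 are illustrative assumptions, not values from the paper.

```python
from math import factorial

def peak_sample_sizes(d, max_order):
    """Return {r: d**r / r!} for r = 1..max_order."""
    return {r: d**r / factorial(r) for r in range(1, max_order + 1)}

peaks = peak_sample_sizes(d=100, max_order=3)
for r, m in peaks.items():
    print(f"r={r}: m ~ {m:,.0f}")
```

For d = 100 this gives roughly 100, 5,000, and 166,667 samples for r = 1, 2, 3, which shows how quickly successive scales separate as r grows.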
Abstract
As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves that characterize how the prediction error depends on the number of samples is restricted to either large-sample asymptotics (m → ∞) or, for certain simple data distributions, to the high-dimensional asymptotics in which the number of samples scales linearly with the dimension (m ∝ d). There is a wide gulf between these two regimes, including all higher-order scaling relations m ∝ d^r, which are the subject of the present paper. We focus on the problem of kernel ridge regression for dot-product kernels and present precise formulas for the mean of the test error, bias, and…
Taxonomy
Topics: Statistical Methods and Inference · Neural Networks and Applications · Bayesian Methods and Mixture Models
