Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression

Lechao Xiao; Hong Hu; Theodor Misiakiewicz; Yue M. Lu; Jeffrey Pennington

arXiv:2205.14846·cs.LG·June 13, 2023

Open Access

TL;DR

This paper derives precise formulas for the learning curves of dot-product kernel regression across higher-order data and model scaling regimes, revealing complex behaviors and peaks in prediction error as data scales with dimension.

Contribution

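The paper gives precise formulas for the mean test error, bias, and variance of kernel ridge regression with dot-product kernels in the higher-order scaling regimes $m \propto d^{r}$, extending the theory beyond the classical large-sample ($m \to \infty$) and linear ($m \propto d$) regimes.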
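To make the quantities concrete, here is a minimal numerical sketch of kernel ridge regression with a dot-product kernel. It estimates the test error by Monte Carlo and does not implement the paper's closed-form formulas. The choices below are assumptions made for illustration only: inputs drawn uniformly from the sphere of radius $\sqrt{d}$, the kernel $k(x, x') = \exp(\langle x, x' \rangle / d)$, a noiseless linear target, and the helper names `sample_sphere`, `dot_product_kernel`, and `krr_test_error`.

```python
import numpy as np


def sample_sphere(n, d, rng):
    """Draw n points uniformly from the sphere of radius sqrt(d) (assumed data model)."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True) * np.sqrt(d)


def dot_product_kernel(a, b, h=np.exp):
    """Kernel matrix k(x, x') = h(<x, x'> / d); h = exp is an illustrative choice."""
    d = a.shape[1]
    return h(a @ b.T / d)


def krr_test_error(m, d, ridge=1e-3, n_test=500, seed=0):
    """Monte Carlo estimate of the squared test error of kernel ridge regression."""
    rng = np.random.default_rng(seed)

    # Hypothetical noiseless linear target f(x) = <beta, x>.
    beta = rng.standard_normal(d) / np.sqrt(d)

    x_train = sample_sphere(m, d, rng)
    x_test = sample_sphere(n_test, d, rng)
    y_train = x_train @ beta
    y_test = x_test @ beta

    # Solve (K + ridge * I) alpha = y for the dual coefficients.
    k_train = dot_product_kernel(x_train, x_train)
    alpha = np.linalg.solve(k_train + ridge * np.eye(m), y_train)

    # Predict on held-out points and average the squared error.
    pred = dot_product_kernel(x_test, x_train) @ alpha
    return float(np.mean((pred - y_test) ** 2))
```

Calling `krr_test_error(m, d)` over a grid of sample sizes `m` at fixed `d` traces an empirical learning curve for this toy setup; averaging over several seeds would be needed before comparing its shape with any theoretical prediction.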

Findings

01

Learning curves exhibit peaks at specific sample sizes, notably when $m \approx d^{r}/r!$.

02

Multiple descent phenomena occur at various scales in the higher-order asymptotics.

03

The results unify and extend existing asymptotic theories for kernel regression.
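As a rough illustration of where the peaks in the first finding would fall, the sketch below evaluates $d^{r}/r!$ for a given dimension and order. It also computes the number of linearly independent degree-$r$ spherical harmonics in dimension $d$, which grows like $d^{r}/r!$ for large $d$. Relating the peak location to that count is an interpretation offered here as an assumption, not something stated in this summary, and the function names are hypothetical.

```python
from math import comb, factorial


def peak_sample_size(d: int, r: int) -> float:
    """Approximate sample size d^r / r! at which the order-r peak is reported."""
    return d ** r / factorial(r)


def harmonic_dim(d: int, r: int) -> int:
    """Number of independent degree-r spherical harmonics on the sphere in R^d.

    Uses the standard count C(d + r - 1, r) - C(d + r - 3, r - 2),
    with the low-order cases handled explicitly.
    """
    if r == 0:
        return 1
    if r == 1:
        return d
    return comb(d + r - 1, r) - comb(d + r - 3, r - 2)


if __name__ == "__main__":
    d = 100
    for r in (1, 2, 3):
        print(r, peak_sample_size(d, r), harmonic_dim(d, r))
```

For `d = 100` and `r = 2` the two quantities are 5000 and 5049, so the approximation is already within about one percent at moderate dimension.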

Abstract

As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves that characterize how the prediction error depends on the number of samples is restricted to either large-sample asymptotics ($m \to \infty$) or, for certain simple data distributions, to the high-dimensional asymptotics in which the number of samples scales linearly with the dimension ($m \propto d$). There is a wide gulf between these two regimes, including all higher-order scaling relations $m \propto d^{r}$, which are the subject of the present paper. We focus on the problem of kernel ridge regression for dot-product kernels and present precise formulas for the mean of the test error, bias, and…

Taxonomy

Topics: Statistical Methods and Inference · Neural Networks and Applications · Bayesian Methods and Mixture Models