Optimal Rate of Kernel Regression in Large Dimensions

Weihao Lu; Haobo Zhang; Yicheng Li; Manyun Xu; Qian Lin

arXiv:2309.04268 · stat.ML · July 1, 2024 · 1 cites

TL;DR

This paper analyzes the optimal convergence rates of kernel regression in high-dimensional settings, revealing new phenomena like multiple descent and periodic plateau behaviors, with implications for neural networks and the neural tangent kernel.

Contribution

It introduces a general framework to characterize minimax bounds for kernel regression in large dimensions and identifies the precise rates and phenomena across different sample size regimes.

Findings

01

Minimax rate of $n^{-1/2}$ for $γ = 2, 4, 6, 8, \dots$

02

Discovery of multiple descent and periodic plateau behaviors in the optimal rate curve

03

Explicit description of the optimal rate curve for the neural tangent kernel (NTK)

Abstract

We study kernel regression for large-dimensional data, where the sample size $n$ depends polynomially on the dimension $d$ of the samples, i.e., $n ≍ d^{γ}$ for some $γ > 0$. We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity $ε_{n}^{2}$ and the metric entropy $\bar{ε}_{n}^{2}$, respectively. When the target function falls into the RKHS associated with a (general) inner product model defined on $S^{d}$, we use the new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n ≍ d^{γ}$ for $γ = 2, 4, 6, 8, \dots$. We then determine the optimal rate of the excess risk of kernel regression for all $γ > 0$ and find that the curve of optimal…
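
As a rough illustration of the headline claim for even $γ$, the statement can be written out as below. The symbols $\hat{f}$ (an estimator), $f^{*}$ (the target function), $\mathcal{H}$ (the RKHS), the radius $R$, and the use of the squared $L^{2}$ norm for the excess risk are notational assumptions made here and are not taken from the abstract. Constants and any lower-order factors are suppressed.

$$
\inf_{\hat{f}} \; \sup_{\|f^{*}\|_{\mathcal{H}} \le R} \; \mathbb{E}\,\bigl\|\hat{f} - f^{*}\bigr\|_{L^{2}}^{2} \;≍\; n^{-1/2}, \qquad n ≍ d^{γ}, \quad γ \in \{2, 4, 6, 8, \dots\}.
$$

Substituting $n ≍ d^{γ}$ expresses the same rate in terms of the dimension: $n^{-1/2} ≍ d^{-γ/2}$, so for example $γ = 2$ corresponds to a rate of order $d^{-1}$ and $γ = 4$ to a rate of order $d^{-2}$.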

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Mathematical Approximation and Integration