Sharp Asymptotics of Kernel Ridge Regression Beyond the Linear Regime

Hong Hu; Yue M. Lu

arXiv:2205.06798 · cs.LG · May 16, 2022 · 6 cites

Open Access

TL;DR

This paper provides a detailed asymptotic analysis of kernel ridge regression's generalization performance across different sample size and dimension regimes, revealing complex behaviors like double descent.

Contribution

It offers the first sharp asymptotic characterization of KRR at the critical transition regions beyond the linear regime, clarifying how parameters such as the choice of kernel function affect generalization performance.

Findings

01

KRR exhibits multi-phased learning with sharp transitions at $n \asymp d^{k}$.

02

The analysis uncovers double descent phenomena in KRR learning curves.

03

Parameters like kernel choice significantly influence generalization performance.

Abstract

The generalization performance of kernel ridge regression (KRR) exhibits a multi-phased pattern that crucially depends on the scaling relationship between the sample size $n$ and the underlying dimension $d$. This phenomenon is due to the fact that KRR sequentially learns functions of increasing complexity as the sample size increases; when $d^{k - 1} \ll n \ll d^{k}$, only polynomials with degree less than $k$ are learned. In this paper, we present sharp asymptotic characterization of the performance of KRR at the critical transition regions with $n \asymp d^{k}$, for $k \in \mathbb{Z}^{+}$. Our asymptotic characterization provides a precise picture of the whole learning process and clarifies the impact of various parameters (including the choice of the kernel function) on the generalization performance. In particular, we show that the learning curves of KRR can have a delicate "double…
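To make the setup in the abstract concrete, here is a minimal NumPy sketch of KRR with an inner-product kernel on the sphere. It is an illustration only, not the authors' experimental setup: the kernel `exp(<x, z> / d)`, the degree-1 target, the ridge parameter `lam`, and the helper names `sample_sphere`, `kernel`, and `krr_test_error` are all assumptions introduced here.

```python
import numpy as np


def sample_sphere(n, d, rng):
    """Draw n points uniformly from the sphere of radius sqrt(d) in R^d."""
    x = rng.standard_normal((n, d))
    return np.sqrt(d) * x / np.linalg.norm(x, axis=1, keepdims=True)


def kernel(a, b):
    """Inner-product kernel K(x, z) = exp(<x, z> / d) (illustrative choice)."""
    return np.exp(a @ b.T / a.shape[1])


def krr_test_error(n, d, lam=1e-3, n_test=2000, seed=0):
    """Fit KRR on n noiseless samples of a linear target; return test MSE."""
    rng = np.random.default_rng(seed)

    # Hypothetical degree-1 target f(x) = <beta, x> with ||beta|| close to 1.
    beta = rng.standard_normal(d) / np.sqrt(d)
    x_train = sample_sphere(n, d, rng)
    x_test = sample_sphere(n_test, d, rng)
    y_train = x_train @ beta
    y_test = x_test @ beta

    # KRR coefficients: alpha = (K + lam * I)^{-1} y.
    gram = kernel(x_train, x_train)
    alpha = np.linalg.solve(gram + lam * np.eye(n), y_train)

    # Predict on fresh points and measure the mean squared error.
    pred = kernel(x_test, x_train) @ alpha
    return float(np.mean((pred - y_test) ** 2))


if __name__ == "__main__":
    d = 20
    for n in (5, 20, 100, 400):
        print(f"n = {n:4d}  test MSE = {krr_test_error(n, d):.4f}")
```

Under these assumptions, one would expect the test error for the linear target to be large when $n$ is well below $d$ and to shrink once $n$ is well above $d$, in line with the staged learning picture described in the abstract. Finite $d$ only hints at the asymptotic behavior, so the numbers should be read qualitatively.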

Taxonomy

Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Machine Learning and ELM