Fast Statistical Leverage Score Approximation in Kernel Ridge Regression
Yifan Chen, Yun Yang

TL;DR
This paper introduces a fast algorithm, linear-time up to poly-log factors, for accurately approximating statistical leverage scores in stationary-kernel ridge regression, making leverage-based sampling for Nyström approximation far cheaper without sacrificing prediction accuracy.
Contribution
It provides a novel analytic formula for leverage scores in stationary-kernel KRR, enabling efficient sampling for Nyström approximation with theoretical guarantees.
Findings
The method is orders of magnitude faster than existing approaches to leverage score estimation.
It maintains the same prediction accuracy as traditional methods.
Theoretical analysis confirms the accuracy of the leverage score approximation.
Abstract
Nyström approximation is a fast randomized method that rapidly solves kernel ridge regression (KRR) problems through sub-sampling the n-by-n empirical kernel matrix appearing in the objective function. However, the performance of such a sub-sampling method heavily relies on correctly estimating the statistical leverage scores for forming the sampling distribution, which can be as costly as solving the original KRR. In this work, we propose a linear time (modulo poly-log terms) algorithm to accurately approximate the statistical leverage scores in the stationary-kernel-based KRR with theoretical guarantees. Particularly, by analyzing the first-order condition of the KRR objective, we derive an analytic formula, which depends on both the input distribution and the spectral density of stationary kernels, for capturing the non-uniformity of the statistical leverage scores. Numerical…
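To make the setting concrete, here is a minimal Python sketch of the baseline pipeline the abstract describes: exact leverage scores followed by leverage-weighted Nyström KRR. It is not the paper's fast approximation, whose analytic formula is not reproduced in this summary. Several things are assumptions made for illustration: the common convention that the leverage scores are the diagonal of K (K + nλI)^{-1}, the Gaussian kernel, the toy data, and all function names. The Nyström fit is also simplified to sample landmarks without replacement and without reweighting.

```python
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    # Stationary (Gaussian) kernel matrix between the rows of X and Y.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def exact_leverage_scores(K, lam):
    # Assumed convention: l_i = [K (K + n*lam*I)^{-1}]_{ii}.
    # This costs O(n^3), the expense the paper's method is designed to avoid.
    n = K.shape[0]
    return np.diag(np.linalg.solve(K + n * lam * np.eye(n), K)).copy()

def nystrom_krr_fit(X, y, landmark_idx, lam, bandwidth=1.0, jitter=1e-8):
    # Simplified Nystrom KRR: restrict the solution to the span of the
    # kernel functions centred at the sampled landmark points.
    n = X.shape[0]
    Z = X[landmark_idx]
    K_nm = gaussian_kernel(X, Z, bandwidth)
    K_mm = gaussian_kernel(Z, Z, bandwidth)
    A = K_nm.T @ K_nm + n * lam * K_mm + jitter * np.eye(len(landmark_idx))
    beta = np.linalg.solve(A, K_nm.T @ y)
    return Z, beta

def nystrom_krr_predict(X_new, Z, beta, bandwidth=1.0):
    return gaussian_kernel(X_new, Z, bandwidth) @ beta

# Toy data (hypothetical): non-uniform 1-D inputs, noisy sine response.
rng = np.random.default_rng(0)
n, m, lam, bandwidth = 200, 40, 1e-3, 0.5
X = rng.normal(size=(n, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=n)

K = gaussian_kernel(X, X, bandwidth)
scores = exact_leverage_scores(K, lam)

# Sampling distribution proportional to the leverage scores.
probs = scores / scores.sum()
landmark_idx = rng.choice(n, size=m, replace=False, p=probs)

Z, beta = nystrom_krr_fit(X, y, landmark_idx, lam, bandwidth)
pred = nystrom_krr_predict(X, Z, beta, bandwidth)
train_mse = float(np.mean((pred - y) ** 2))
```

In this sketch, points in sparse regions of the input distribution tend to receive larger scores and are therefore sampled more often. That is the non-uniformity the abstract says the analytic formula captures.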
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Inference · Stochastic Gradient Optimization Techniques
