# Spectrally-truncated kernel ridge regression and its free lunch

**Authors:** Arash A. Amini

arXiv: 1906.06276 · 2019-10-15

## TL;DR

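This paper analyzes spectrally-truncated kernel ridge regression (KRR), in which the kernel matrix is truncated to its largest $r < n$ eigenvalues. It shows that, whenever the RKHS is infinite-dimensional, truncation with $r$ above a threshold outperforms full KRR in minimax risk, and it characterizes the exact trade-off between the truncation level and the regularization parameter.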

## Contribution

It derives an exact expression for the maximum risk of truncated KRR over the unit ball of the RKHS, and uses it to show that truncation with $r$ above a threshold improves on full KRR when the risk is minimized over the regularization parameter.

## Key findings

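- For $r$ above a threshold, spectrally-truncated KRR outperforms full KRR in minimax risk (maximum over the unit ball of the RKHS, minimum over the regularization parameter), provided the RKHS is infinite-dimensional.
- The improvement holds for all finite samples (above the threshold), so truncation does more than preserve the rate.
- The implicit regularization from truncation is not a substitute for Hilbert norm regularization; both are needed to achieve the best performance.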
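
The interplay between the two kinds of regularization can be read off a spectral form of the fitted values. The display below is a sketch under a common convention that this summary does not confirm for the paper: $K = \sum_{i=1}^{n} \mu_i u_i u_i^\top$ is the eigendecomposition of the $n \times n$ kernel matrix with $\mu_1 \ge \dots \ge \mu_n \ge 0$, $y$ is the response vector, and the ridge penalty enters as $n\lambda$. The symbols $\mu_i$, $u_i$, $\lambda$ and the scaling are assumptions made for illustration.

$$
\hat{y}_{r,\lambda} \;=\; \sum_{i=1}^{r} \frac{\mu_i}{\mu_i + n\lambda}\, u_i u_i^\top y
$$

In this form, truncation sets the shrinkage factor to zero for every $i > r$, while $\lambda$ shrinks the $r$ retained components. Taking $\lambda \to 0$ leaves the retained components unshrunk, which gives one informal reading of why truncation alone might not replace the Hilbert norm penalty.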

## Abstract

Kernel ridge regression (KRR) is a well-known and popular nonparametric regression approach with many desirable properties, including minimax rate-optimality in estimating functions that belong to common reproducing kernel Hilbert spaces (RKHS). The approach, however, is computationally intensive for large data sets, due to the need to operate on a dense $n \times n$ kernel matrix, where $n$ is the sample size. Recently, various approximation schemes for solving KRR have been considered, and some analyzed. Some approaches such as Nyström approximation and sketching have been shown to preserve the rate optimality of KRR. In this paper, we consider the simplest approximation, namely, spectrally truncating the kernel matrix to its largest $r < n$ eigenvalues. We derive an exact expression for the maximum risk of this truncated KRR, over the unit ball of the RKHS. This result can be used to study the exact trade-off between the level of spectral truncation and the regularization parameter. We show that, as long as the RKHS is infinite-dimensional, there is a threshold on $r$ above which the spectrally-truncated KRR surprisingly outperforms the full KRR in terms of the minimax risk, where the minimum is taken over the regularization parameter. This strengthens the existing results on approximation schemes by showing that not only does one not lose in terms of the rates, but truncation can in fact improve the performance, for all finite samples (above the threshold). Moreover, we show that the implicit regularization achieved by spectral truncation is not a substitute for Hilbert norm regularization. Both are needed to achieve the best performance.
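
To make the truncation step concrete, here is a minimal NumPy sketch of in-sample fitted values for spectrally-truncated KRR. It is an illustration under stated assumptions, not the paper's implementation: the Gaussian kernel, the `n * lam` scaling of the penalty, the synthetic data, and the helper names `gaussian_kernel` and `truncated_krr_fitted` are all hypothetical choices made for this example.

```python
import numpy as np


def gaussian_kernel(X, Z, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z (assumed kernel)."""
    sq = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    # Clip tiny negative squared distances caused by round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def truncated_krr_fitted(K, y, r, lam):
    """In-sample fitted values using only the top-r eigenpairs of K.

    Assumes the ridge penalty enters as n * lam; the paper's normalization
    may differ.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    n = K.shape[0]
    evals, evecs = np.linalg.eigh(K)       # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:r]      # indices of the r largest
    mu = np.maximum(evals[top], 0.0)       # guard against round-off negatives
    U = evecs[:, top]
    shrink = mu / (mu + n * lam)           # ridge shrinkage on retained components
    return U @ (shrink * (U.T @ y))        # components beyond r are dropped


# Synthetic data, chosen only for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(50)

K = gaussian_kernel(X, X)
fitted_full = truncated_krr_fitted(K, y, r=50, lam=1e-2)   # r = n: full KRR
fitted_trunc = truncated_krr_fitted(K, y, r=10, lam=1e-2)  # keep 10 eigenpairs
```

With `r` equal to the sample size, the function reduces to the usual full-KRR fitted values `K @ solve(K + n * lam * I, y)`. The sketch uses a full eigendecomposition for clarity; a partial eigensolver that computes only the leading `r` eigenpairs would be the natural choice when the goal is to reduce computation.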

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.06276/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1906.06276/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/1906.06276/full.md

---
Source: https://tomesphere.com/paper/1906.06276