Fast Computation of Leave-One-Out Cross-Validation for $k$-NN Regression

Motonobu Kanagawa

arXiv:2405.04919·stat.ML·December 5, 2024

TL;DR

This paper introduces a computationally efficient method for leave-one-out cross-validation (LOOCV) in $k$-NN regression: the LOOCV score is obtained from a single $(k+1)$-NN regression fit instead of one $k$-NN fit per training point.

Contribution

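The author shows that, under a tie-breaking condition for nearest neighbours, the LOOCV estimate of the mean square error for $k$-NN regression equals the training mean square error of $(k+1)$-NN regression multiplied by $(k+1)^{2}/k^{2}$. LOOCV for $k$-NN can therefore be computed from a single $(k+1)$-NN regression fit.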

Findings

01

The method is validated through numerical experiments.

02

LOOCV score can be computed from a single $(k+1)$-NN fit.

03

The approach removes the need to repeat the training and validation of $k$-NN regression once per training point.
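
The scaling factor in the second finding can be motivated by a short calculation. This is a sketch under assumed notation, not the paper's own proof: $\hat{f}_{k+1}(x_i)$ denotes the $(k+1)$-NN fit evaluated at the training point $x_i$, and $\hat{f}_{k}^{(-i)}(x_i)$ denotes the $k$-NN prediction at $x_i$ with the pair $(x_i, y_i)$ held out. It assumes the tie-breaking condition makes $x_i$ its own nearest neighbour, so that the other $k$ neighbours of $x_i$ are exactly its $k$ nearest neighbours in the held-out data.

$$
\hat{f}_{k+1}(x_i) = \frac{y_i + k\,\hat{f}_{k}^{(-i)}(x_i)}{k+1}
\quad\Longrightarrow\quad
y_i - \hat{f}_{k}^{(-i)}(x_i) = \frac{k+1}{k}\,\bigl(y_i - \hat{f}_{k+1}(x_i)\bigr)
$$

$$
\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{f}_{k}^{(-i)}(x_i)\bigr)^{2}
= \frac{(k+1)^{2}}{k^{2}}\cdot\frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{f}_{k+1}(x_i)\bigr)^{2}
$$

The left-hand side is the LOOCV estimate for $k$-NN regression; the right-hand side is the training mean square error of $(k+1)$-NN regression times $(k+1)^{2}/k^{2}$.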

Abstract

We describe a fast computation method for leave-one-out cross-validation (LOOCV) for $k$-nearest neighbours ($k$-NN) regression. We show that, under a tie-breaking condition for nearest neighbours, the LOOCV estimate of the mean square error for $k$-NN regression is identical to the mean square error of $(k+1)$-NN regression evaluated on the training data, multiplied by the scaling factor $(k+1)^{2}/k^{2}$. Therefore, to compute the LOOCV score, one only needs to fit $(k+1)$-NN regression only once, and does not need to repeat training-validation of $k$-NN regression for the number of training data. Numerical experiments confirm the validity of the fast computation method.
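
The identity stated in the abstract can be checked numerically. The following is a minimal sketch, not the paper's code: it assumes Euclidean distance, synthetic continuous data (so distance ties occur with probability zero), and a brute-force neighbour search in plain NumPy. The names `knn_predict`, `loocv_naive`, and `loocv_fast` are illustrative and do not come from the paper.

```python
import numpy as np


def knn_predict(X_train, y_train, X_query, k):
    """Brute-force k-NN regression: average the targets of the k nearest training points."""
    # Squared Euclidean distances, shape (n_query, n_train).
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1, kind="stable")[:, :k]
    return y_train[idx].mean(axis=1)


def loocv_naive(X, y, k):
    """LOOCV estimate of the MSE, refitting k-NN with each point held out in turn."""
    n = len(y)
    sq_errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        pred = knn_predict(X[keep], y[keep], X[i:i + 1], k)[0]
        sq_errors[i] = (y[i] - pred) ** 2
    return sq_errors.mean()


def loocv_fast(X, y, k):
    """Same quantity from one (k+1)-NN fit evaluated on the training data."""
    fitted = knn_predict(X, y, X, k + 1)
    train_mse = np.mean((y - fitted) ** 2)
    return (k + 1) ** 2 / k ** 2 * train_mse


# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)

k = 3
naive = loocv_naive(X, y, k)
fast = loocv_fast(X, y, k)
print(naive, fast)  # the two values should agree up to floating-point error
```

The naive routine performs one neighbour search per held-out point, while the fast routine performs a single $(k+1)$-NN evaluation on the training set. If the data contain duplicated inputs or tied distances, the two values may differ unless ties are broken as the paper's condition requires.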

Taxonomy

Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Statistical Methods and Bayesian Inference