A scalable estimate of the extra-sample prediction error via approximate leave-one-out

Kamiar Rahnama Rad; Arian Maleki

arXiv:1801.10243 · stat.ME · February 12, 2020 · 21 cites

TL;DR

This paper introduces a computationally efficient approximate leave-one-out (ALO) formula for high-dimensional regularized estimators, providing accurate out-of-sample risk estimates with theoretical guarantees and practical validation.

Contribution

It presents a novel closed-form ALO method that approximates leave-one-out cross-validation in high-dimensional settings without requiring sparsity assumptions.

Findings

01

ALO closely approximates LO, and the gap between them shrinks as $n$ and $p$ grow.

02

A finite-sample upper bound shows that $|\mathrm{LO} - \mathrm{ALO}| \to 0$ with overwhelming probability as $n, p \to \infty$.

03

Numerical experiments confirm the method's excellent finite-sample performance.

Abstract

The paper considers the problem of out-of-sample risk estimation in high-dimensional settings, where standard techniques such as $K$-fold cross validation suffer from large biases. Motivated by the low bias of the leave-one-out cross validation (LO) method, we propose a computationally efficient closed-form approximate leave-one-out formula (ALO) for a large class of regularized estimators. Given the regularized estimate, calculating ALO requires minor computational overhead. With minor assumptions about the data generating process, we obtain a finite-sample upper bound for $|\mathrm{LO} - \mathrm{ALO}|$. Our theoretical analysis illustrates that $|\mathrm{LO} - \mathrm{ALO}| \to 0$ with overwhelming probability when $n, p \to \infty$, where the dimension $p$ of the feature vectors may be comparable with or even greater than the number of observations, $n$. Despite the…
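To make the idea of a closed-form leave-one-out estimate concrete, the sketch below uses ridge regression with squared loss, a special case in which the textbook hat-matrix shortcut reproduces LO exactly from a single fit. It is an illustration only, not the paper's general ALO formula, which targets a larger class of regularized estimators. The function names, the simulated data, and the choice `lam = 1.0` are all assumptions made for this example.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge estimate without intercept: argmin ||y - X b||^2 + lam * ||b||^2."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def loo_brute_force(X, y, lam):
    """Leave-one-out squared error computed by refitting n times."""
    n = X.shape[0]
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i = ridge_fit(X[keep], y[keep], lam)
        errs[i] = (y[i] - X[i] @ beta_i) ** 2
    return errs.mean()

def loo_shortcut(X, y, lam):
    """Same quantity from one fit, via residual_i / (1 - H_ii)."""
    p = X.shape[1]
    # Hat matrix H = X (X^T X + lam I)^{-1} X^T, so fitted values are H y.
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

# Simulated data with p > n (hypothetical sizes, for illustration only).
rng = np.random.default_rng(0)
n, p = 40, 60
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta = rng.standard_normal(p)
y = X @ beta + 0.5 * rng.standard_normal(n)

lo = loo_brute_force(X, y, lam=1.0)
alo = loo_shortcut(X, y, lam=1.0)
print(f"brute-force LO: {lo:.6f}, single-fit shortcut: {alo:.6f}")
```

The brute-force version solves n separate ridge problems, while the shortcut reuses one factorization. For estimators where no exact identity of this kind exists, the single-fit quantity is only an approximation to LO, which is the regime the abstract's bound on $|\mathrm{LO} - \mathrm{ALO}|$ addresses.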

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Probabilistic and Robust Engineering Design