Approximate Leave-one-out Cross Validation for Regression with $\ell_1$ Regularizers (extended version)
Arnab Auddy, Haolin Zou, Kamiar Rahnama Rad, Arian Maleki

TL;DR
This paper develops a new theoretical framework for approximate leave-one-out cross validation (ALO) in generalized linear models with non-differentiable regularizers, with a focus on $\ell_1$ regularization, and shows that ALO remains accurate as the number of features grows.
Contribution
The paper introduces a theory that bounds the error between ALO and LO for models with non-differentiable regularizers, extending earlier results that covered only differentiable regularizers.
Findings
|ALO - LO| converges to zero as p grows while n/p and the signal-to-noise ratio (SNR) stay fixed
Theoretical bounds relate the error to the size of leave-i-out perturbations in the active sets and to the sample size
Applicable to $\ell_1$-regularized problems in high-dimensional settings
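For orientation, the quantities named in these findings can be written out for the squared-loss lasso, a special case chosen here purely for illustration. The notation ($\hat\beta$, $\hat\beta_{/i}$, $\mathcal{A}$, $\lambda$) is an assumption and may differ from the paper's, and the symmetric difference in the last line is only one plausible way to measure a leave-i-out perturbation of the active set.

```latex
% Full-data and leave-i-out lasso estimators (squared loss, illustrative special case)
\hat\beta = \arg\min_{\beta \in \mathbb{R}^p}
  \tfrac{1}{2} \sum_{j=1}^{n} (y_j - x_j^\top \beta)^2 + \lambda \|\beta\|_1,
\qquad
\hat\beta_{/i} = \arg\min_{\beta \in \mathbb{R}^p}
  \tfrac{1}{2} \sum_{j \neq i} (y_j - x_j^\top \beta)^2 + \lambda \|\beta\|_1 .

% Leave-one-out estimate of the out-of-sample error
\mathrm{LO} = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - x_i^\top \hat\beta_{/i} \bigr)^2 .

% Active sets and an assumed measure of their leave-i-out perturbation
\mathcal{A} = \{ k : \hat\beta_k \neq 0 \}, \qquad
\mathcal{A}_{/i} = \{ k : \hat\beta_{/i,k} \neq 0 \}, \qquad
\bigl| \mathcal{A} \,\triangle\, \mathcal{A}_{/i} \bigr| .
```

Computing LO exactly requires n separate fits, which is the cost ALO is designed to avoid.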
Abstract
The out-of-sample error (OO) is the main quantity of interest in risk estimation and model selection. Leave-one-out cross validation (LO) offers a (nearly) distribution-free yet computationally demanding approach to estimate OO. Recent theoretical work showed that approximate leave-one-out cross validation (ALO) is a computationally efficient and statistically reliable estimate of LO (and OO) for generalized linear models with differentiable regularizers. For problems involving non-differentiable regularizers, despite significant empirical evidence, the theoretical understanding of ALO's error remains unknown. In this paper, we present a novel theory for a wide class of problems in the generalized linear model family with non-differentiable regularizers. We bound the error |ALO - LO| in terms of intuitive metrics such as the size of leave-i-out perturbations in active sets, sample size…
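To make the LO versus ALO comparison concrete, here is a minimal sketch for the squared-loss lasso on synthetic data. It is not the paper's code. The ALO shortcut used, which divides each residual by 1 - H_ii with H the projection onto the active columns, is the commonly used form for the lasso and is assumed here rather than taken from this summary. The function names (`lasso_cd`, `exact_lo`, `alo`), the data, and the value of `lam` are all hypothetical.

```python
import numpy as np

def lasso_cd(X, y, lam, n_sweeps=1000, tol=1e-9):
    """Minimize 0.5*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y.copy()  # running residual y - X b
    for _ in range(n_sweeps):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ r + col_sq[j] * b[j]
            # Soft-thresholding gives exact zeros for inactive coordinates.
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            step = new - b[j]
            if step != 0.0:
                r -= X[:, j] * step
                b[j] = new
                max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return b

def exact_lo(X, y, lam):
    """Exact leave-one-out squared error: refit n times, once per held-out row."""
    n = X.shape[0]
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        b_i = lasso_cd(X[keep], y[keep], lam)
        errs[i] = (y[i] - X[i] @ b_i) ** 2
    return errs.mean()

def alo(X, y, lam):
    """Approximate LO from a single fit (assumed active-set hat-matrix form)."""
    b = lasso_cd(X, y, lam)
    resid = y - X @ b
    active = np.flatnonzero(b)
    if active.size == 0:
        return np.mean(resid ** 2)
    XA = X[:, active]
    H = XA @ np.linalg.pinv(XA)  # projection onto the span of the active columns
    h = np.diag(H)
    return np.mean((resid / (1.0 - h)) ** 2)

# Synthetic data (illustrative only): sparse signal, columns of roughly unit norm.
rng = np.random.default_rng(0)
n, p = 40, 20
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta = np.zeros(p)
beta[:4] = 3.0
y = X @ beta + 0.3 * rng.standard_normal(n)

lam = 0.5
lo_val = exact_lo(X, y, lam)
alo_val = alo(X, y, lam)
print(f"LO = {lo_val:.4f}, ALO = {alo_val:.4f}, |ALO - LO| = {abs(alo_val - lo_val):.4f}")
```

In this special case the shortcut reproduces a point's leave-one-out residual exactly whenever removing that point leaves the active set and the signs of the active coefficients unchanged, so any gap between the two printed values comes from points whose removal perturbs the active set.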
Taxonomy
Topics: Statistical Methods and Inference · Fault Detection and Control Systems · Control Systems and Identification
