ROTI-GCV: Generalized Cross-Validation for right-ROTationally Invariant Data
Kevin Luo, Yufan Li, Pragya Sur

TL;DR
This paper introduces ROTI-GCV, a new framework for reliable cross-validation in high-dimensional settings with dependent or heavy-tailed data, improving model tuning and risk estimation accuracy.
Contribution
We develop ROTI-GCV, a novel cross-validation method tailored for right-rotationally invariant data, addressing limitations of existing methods in dependent and heavy-tailed high-dimensional data.
Findings
ROTI-GCV accurately estimates out-of-sample risk in synthetic data.
The method outperforms traditional cross-validation in dependent data scenarios.
New estimators for signal-to-noise ratio and noise variance are effective.
Abstract
Two key tasks in high-dimensional regularized regression are tuning the regularization strength for accurate predictions and estimating the out-of-sample risk. It is known that the standard approach -- k-fold cross-validation -- is inconsistent in modern high-dimensional settings. While leave-one-out and generalized cross-validation remain consistent in some high-dimensional cases, they become inconsistent when samples are dependent or contain heavy-tailed covariates. As a first step towards modeling structured sample dependence and heavy tails, we use right-rotationally invariant covariate distributions -- a crucial concept from compressed sensing. In the proportional asymptotics regime where the number of features and samples grow comparably, which is known to better reflect the empirical behavior in moderately sized datasets, we introduce a new framework, ROTI-GCV, for reliably…
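For context, the classical generalized cross-validation (GCV) criterion that the abstract takes as its baseline can be sketched for ridge regression as below. This is the standard textbook estimator, not ROTI-GCV itself, whose construction is not given in this summary. The function names `ridge_gcv` and `tune_ridge_gcv`, the penalty scaling `n * lam`, and the i.i.d. Gaussian synthetic data are all assumptions made for illustration.

```python
import numpy as np

def ridge_gcv(X, y, lam):
    """Classical GCV score for ridge with penalty n * lam * ||beta||^2 (assumed scaling)."""
    n, _ = X.shape
    # Thin SVD yields the hat-matrix eigenvalues without forming the n x n matrix.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    shrink = s**2 / (s**2 + n * lam)       # eigenvalues of S = X (X'X + n*lam*I)^{-1} X'
    fitted = U @ (shrink * (U.T @ y))      # S @ y
    resid = y - fitted
    df = shrink.sum()                      # tr(S), the effective degrees of freedom
    # Training error inflated by the squared (1 - df/n) correction.
    return np.mean(resid**2) / (1.0 - df / n) ** 2

def tune_ridge_gcv(X, y, lambdas):
    """Return the candidate lambda with the smallest GCV score, plus all scores."""
    scores = np.array([ridge_gcv(X, y, lam) for lam in lambdas])
    return lambdas[int(np.argmin(scores))], scores

# Synthetic i.i.d. Gaussian design (illustrative only; not the paper's data model).
rng = np.random.default_rng(0)
n, p = 200, 100
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + rng.standard_normal(n)

lambdas = np.logspace(-3, 1, 20)
best_lam, scores = tune_ridge_gcv(X, y, lambdas)
```

The score serves both tasks named in the abstract: its minimizer is a tuning choice for the regularization strength, and its value is an estimate of out-of-sample risk. According to the abstract, this classical form can become inconsistent when samples are dependent or covariates are heavy-tailed, which is the gap ROTI-GCV is meant to address.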
Taxonomy
Topics: Fault Detection and Control Systems
