Asymptotically free sketched ridge ensembles: Risks, cross-validation, and tuning
Pratik Patil, Daniel LeJeune

TL;DR
This paper uses random matrix theory to prove the consistency of GCV for tuning sketched ridge regression ensembles, enabling efficient risk estimation, optimal sketch size tuning, and accurate prediction intervals in large-scale settings.
Contribution
It introduces a theoretical framework for GCV in sketched ridge ensembles, extending risk estimation and tuning methods to a broad class of sketches and risk functionals.
Findings
GCV is consistent for risk estimation in sketched ridge ensembles.
Squared prediction risk can be globally optimized by tuning only the sketch size in infinite ensembles.
Empirical validation confirms theoretical results on synthetic and real datasets.
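The findings above can be illustrated with a minimal sketch. It is not the authors' code. It assumes Gaussian feature sketches, a finite ensemble of K members as a stand-in for the infinite ensemble, and the standard GCV formula for a linear smoother (training error divided by the squared factor 1 - tr(L)/n) applied to the averaged ensemble predictor. The function names `sketched_ridge_ensemble_gcv` and `tune_sketch_size`, the sketch normalization, and all parameter names are hypothetical.

```python
import numpy as np


def sketched_ridge_ensemble_gcv(X, y, q, lam, K, seed=0):
    """GCV risk estimate for an ensemble of K sketched ridge regressors.

    Assumptions (not taken from the paper): Gaussian sketches S of shape
    (p, q) with N(0, 1/q) entries, and ridge fitted on the sketched
    features X @ S with penalty lam > 0.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)   # averaged coefficients in the original feature space
    trace = 0.0          # trace of the averaged smoother matrix L
    for _ in range(K):
        S = rng.standard_normal((p, q)) / np.sqrt(q)
        Z = X @ S                                # sketched features, n x q
        G = Z.T @ Z / n + lam * np.eye(q)        # regularized sketched Gram matrix
        A = np.linalg.solve(G, Z.T) / n          # q x n, maps y to sketched coefficients
        beta += S @ (A @ y) / K
        trace += np.sum(Z * A.T) / K             # tr(Z @ A) without forming the n x n matrix
    train_mse = np.mean((y - X @ beta) ** 2)
    gcv = train_mse / (1.0 - trace / n) ** 2     # standard GCV correction for a linear smoother
    return gcv, train_mse, beta


def tune_sketch_size(X, y, q_grid, lam, K, seed=0):
    """Pick the sketch size in q_grid with the smallest GCV estimate, holding lam fixed."""
    scores = {q: sketched_ridge_ensemble_gcv(X, y, q, lam, K, seed)[0] for q in q_grid}
    best_q = min(scores, key=scores.get)
    return best_q, scores
```

Calling `tune_sketch_size(X, y, q_grid=[10, 20, 30], lam=0.1, K=5)` returns the selected sketch size together with the GCV score for each candidate. Holding `lam` fixed and searching only over `q` mirrors the sketch-size tuning result, although with finite K it is only an approximation of the infinite-ensemble setting.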
Abstract
We employ random matrix theory to establish consistency of generalized cross validation (GCV) for estimating prediction risks of sketched ridge regression ensembles, enabling efficient and consistent tuning of regularization and sketching parameters. Our results hold for a broad class of asymptotically free sketches under very mild data assumptions. For squared prediction risk, we provide a decomposition into an unsketched equivalent implicit ridge bias and a sketching-based variance, and prove that the risk can be globally optimized by only tuning sketch size in infinite ensembles. For general subquadratic prediction risk functionals, we extend GCV to construct consistent risk estimators, and thereby obtain distributional convergence of the GCV-corrected predictions in Wasserstein-2 metric. This in particular allows construction of prediction intervals with asymptotically correct…
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Face and Expression Recognition · Stochastic Gradient Optimization Techniques
