Asymptotic Theory of $K$-fold Cross-validation in Lasso and the validity of Bootstrap
Mayukh Choudhury, Debraj Das

TL;DR
This paper provides a theoretical foundation for K-fold cross-validation in Lasso regression: it shows that the resulting estimator is n^{1/2}-consistent, though not variable selection consistent, and validates bootstrap methods for inference.
Contribution
It establishes the asymptotic properties of K-fold CV Lasso and confirms bootstrap validity for statistical inference in heteroscedastic linear models.
Findings
K-fold CV Lasso is n^{1/2}-consistent under moment-type conditions, but it is not variable selection consistent.
Bootstrap accurately approximates the distribution of the K-fold CV Lasso estimator.
Bootstrap-based inference is validated through simulations and real data.
Abstract
The least absolute shrinkage and selection operator, or Lasso, is one of the most widely used regularization methods in regression. In practice, statisticians usually implement Lasso by choosing the penalty parameter in a data-dependent way, the most popular choice being $K$-fold cross-validation (or $K$-fold CV). However, inferential properties of the $K$-fold CV based Lasso estimator, such as variable selection consistency and $n^{1/2}$-consistency, and the validity of the Bootstrap approximation are still unknown. In this paper, we consider the heteroscedastic linear regression model and show, under only some moment-type conditions, that the Lasso estimator with $K$-fold CV based penalty is $n^{1/2}$-consistent, but not variable selection consistent. Additionally, we establish the validity of Bootstrap in approximating the distribution of the $K$-fold CV based Lasso estimator. Therefore, our results…
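As a rough illustration of the estimator the abstract studies, the following is a minimal sketch of choosing the Lasso penalty by K-fold CV on simulated heteroscedastic data. Everything here is an assumption made for illustration and is not taken from the paper: the function names (`lasso_cd`, `kfold_cv_lambda`), the coordinate-descent solver, the unnormalized objective 0.5 * ||y - Xb||^2 + lam * ||b||_1, the penalty grid, and the simulated design. The paper's Bootstrap scheme is not reproduced.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Minimise 0.5 * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    r = y.astype(float).copy()           # residual y - X b, starting from b = 0
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]          # add back coordinate j's contribution
            rho = X[:, j] @ r
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]          # subtract the updated contribution
    return b

def kfold_cv_lambda(X, y, lambdas, K=5, seed=0):
    """Return the penalty in `lambdas` with the smallest K-fold CV squared error."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)
    cv_err = np.zeros(len(lambdas))
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for i, lam in enumerate(lambdas):
            b = lasso_cd(X[train], y[train], lam)
            cv_err[i] += np.sum((y[test_idx] - X[test_idx] @ b) ** 2)
    return lambdas[int(np.argmin(cv_err))], cv_err

# Simulated heteroscedastic linear model (illustrative choice, not the paper's design).
rng = np.random.default_rng(42)
n, beta_true = 300, np.array([2.0, -1.5, 0.0, 0.0])
X = rng.standard_normal((n, beta_true.size))
noise = (0.5 + np.abs(X[:, 0])) * rng.standard_normal(n)   # error scale depends on X
y = X @ beta_true + noise

lambdas = np.array([0.1, 1.0, 5.0, 10.0, 20.0])            # hypothetical penalty grid
lam_hat, cv_err = kfold_cv_lambda(X, y, lambdas, K=5)
beta_hat = lasso_cd(X, y, lam_hat)
print("selected penalty:", lam_hat)
print("K-fold CV Lasso estimate:", np.round(beta_hat, 3))
```

With a fixed number of folds, the selected penalty is itself random, which is why the distribution of the resulting estimator is not covered by results for a deterministic penalty sequence.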
