The sparsity and bias of the Lasso selection in high-dimensional linear regression
Cun-Hui Zhang, Jian Huang

TL;DR
This paper analyzes the sparsity and bias of the Lasso in high-dimensional linear regression. It shows that, under a sparse Riesz condition on the design, the Lasso selects a model of the correct order of dimensionality and controls the bias of that model, even when the number of variables exceeds the sample size.
Contribution
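Earlier results showed that the Lasso selects exactly the set of nonzero coefficients under a strong irrepresentable condition, provided those coefficients are bounded away from zero. This paper extends them by proving rate consistency of the Lasso under a sparse Riesz condition on the correlation of the design variables, allowing coefficients outside an ideal model to be small but not necessarily zero, and allowing more variables than observations.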
Findings
The Lasso selects a model of the correct order of dimensionality.
The bias of the selected model is controlled at a level determined by the contributions of the small regression coefficients and the threshold bias.
Under the stated conditions, the error measures converge at optimal rates.
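The first finding can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes an i.i.d. Gaussian design with columns scaled so that each has squared norm n, five large coefficients plus five small nonzero ones, unit noise variance, and a penalty level of sqrt(2 log(p) / n). All of these choices, and the helper names `soft_threshold`, `lasso_cd` and `simulate`, are hypothetical. The Lasso is fitted by plain coordinate descent on the objective (1/(2n))||y - Xb||^2 + lam * ||b||_1.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(cols, y, lam, sweeps=50):
    """Coordinate descent for (1/(2n))||y - Xb||^2 + lam * ||b||_1.

    cols is a list of p columns of X, each of length n and scaled so that
    its sum of squares equals n (an assumption of this sketch).
    """
    n, p = len(y), len(cols)
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta starting at zero
    for _ in range(sweeps):
        for j, x in enumerate(cols):
            # Correlation of column j with the partial residual.
            rho = sum(xi * ri for xi, ri in zip(x, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [ri - delta * xi for xi, ri in zip(x, resid)]
                beta[j] = new
    return beta


def simulate(n=60, p=120, seed=0):
    """Hypothetical data: 5 large, 5 small, and p - 10 zero coefficients."""
    rng = random.Random(seed)
    cols = []
    for _ in range(p):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        scale = math.sqrt(n / sum(v * v for v in x))
        cols.append([v * scale for v in x])
    beta = [0.0] * p
    for j in range(5):
        beta[j] = 3.0
    for j in range(5, 10):
        beta[j] = 0.01
    y = [
        sum(cols[j][i] * beta[j] for j in range(10)) + rng.gauss(0.0, 1.0)
        for i in range(n)
    ]
    return cols, y, beta


cols, y, beta_true = simulate()
n, p = len(y), len(cols)

# Assumed penalty level of order sqrt(log(p) / n), with noise sd equal to 1.
lam = math.sqrt(2.0 * math.log(p) / n)
beta_hat = lasso_cd(cols, y, lam)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]

# Smallest penalty at which every coefficient is thresholded to zero.
lam_max = max(abs(sum(xi * yi for xi, yi in zip(x, y))) / n for x in cols)
beta_null = lasso_cd(cols, y, lam_max)

print("n =", n, "p =", p, "lambda =", round(lam, 3))
print("selected variables:", selected)
print("number selected:", len(selected), "(large coefficients: 5)")
```

In a run like this, the number of selected variables can be compared with the number of large coefficients. The paper's claim is about the order of that number, not exact recovery of the support, so a few extra selected variables are consistent with it.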
Abstract
Meinshausen and Buhlmann [Ann. Statist. 34 (2006) 1436--1462] showed that, for neighborhood selection in Gaussian graphical models, under a neighborhood stability condition, the LASSO is consistent, even when the number of variables is of greater order than the sample size. Zhao and Yu [(2006) J. Machine Learning Research 7 2541--2567] formalized the neighborhood stability condition in the context of linear regression as a strong irrepresentable condition. That paper showed that under this condition, the LASSO selects exactly the set of nonzero regression coefficients, provided that these coefficients are bounded away from zero at a certain rate. In this paper, the regression coefficients outside an ideal model are assumed to be small, but not necessarily zero. Under a sparse Riesz condition on the correlation of design variables, we prove that the LASSO selects a model of the correct…
