Least squares after model selection in high-dimensional sparse models

Alexandre Belloni; Victor Chernozhukov

arXiv:1001.0188 · math.ST · March 21, 2013

TL;DR

This paper shows that applying OLS to the model selected by Lasso (post-Lasso) converges at least as fast as Lasso itself and has smaller bias, even when the Lasso-based model selection misses some components of the true model.

Contribution

The paper provides a nonasymptotic rate analysis of OLS post-Lasso estimators, showing that they are robust to imperfect model selection and can improve on Lasso. The analysis includes a new sparsity bound and practical thresholding schemes.

Findings

01

Post-Lasso performs at least as well as Lasso in convergence rate.

02

OLS post-Lasso can outperform Lasso if the model is correctly selected.

03

The analysis applies to various thresholding methods and includes a new sparsity bound.

Abstract

In this article we study post-model selection estimators that apply ordinary least squares (OLS) to the model selected by first-step penalized estimators, typically Lasso. It is well known that Lasso can estimate the nonparametric regression function at nearly the oracle rate, and is thus hard to improve upon. We show that the OLS post-Lasso estimator performs at least as well as Lasso in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the Lasso-based model selection "fails" in the sense of missing some components of the "true" regression model. By the "true" model, we mean the best s-dimensional approximation to the nonparametric regression function chosen by the oracle. Furthermore, the OLS post-Lasso estimator can perform strictly better than Lasso, in the sense of a strictly faster rate of convergence, if the…
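The two-step procedure the abstract describes (fit Lasso, then refit OLS on the selected components) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, uses a hand-rolled cyclic coordinate-descent Lasso, and picks an illustrative penalty level `lam`. The function names `soft_threshold`, `lasso_cd`, and `post_lasso_ols`, as well as the simulated design, are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Scalar soft-thresholding: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n   # X_j' X_j / n for each column j
    resid = y.astype(float)             # residual y - X beta (beta starts at 0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (excluding j).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)  # keep resid = y - X beta
            beta[j] = new_bj
    return beta


def post_lasso_ols(X, y, lam):
    """Step 1: Lasso selects a support. Step 2: OLS refit on that support."""
    beta_lasso = lasso_cd(X, y, lam)
    support = np.flatnonzero(beta_lasso)
    beta_post = np.zeros(X.shape[1])
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta_post[support] = coef
    return beta_lasso, beta_post, support


# Simulated sparse design (all values below are assumptions for illustration).
rng = np.random.default_rng(0)
n, p, s, sigma = 100, 200, 5, 1.0
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:s] = 3.0
y = X @ beta_true + sigma * rng.standard_normal(n)

lam = 1.1 * sigma * np.sqrt(2.0 * np.log(p) / n)  # illustrative penalty level
beta_lasso, beta_post, support = post_lasso_ols(X, y, lam)

print("selected components:", support.size)
print("Lasso coefficient error:     ", np.linalg.norm(beta_lasso - beta_true))
print("post-Lasso coefficient error:", np.linalg.norm(beta_post - beta_true))
```

The refit removes the shrinkage that the penalty imposes on the selected coefficients, which is the source of the smaller bias mentioned in the abstract. By construction, the in-sample residual sum of squares of the refit cannot exceed that of the Lasso fit, since both are supported on the same components; whether the estimation error also improves depends on the design, the penalty level, and which components were selected.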
