Extended Comparisons of Best Subset Selection, Forward Stepwise Selection, and the Lasso
Trevor Hastie, Robert Tibshirani, Ryan J. Tibshirani

TL;DR
This paper compares best subset selection, forward stepwise selection, and the lasso in regression, showing that their relative performance depends on the signal-to-noise ratio (SNR), with the relaxed lasso often performing best overall.
Contribution
It presents an expanded simulation study of these methods, showing how their relative performance changes across SNR regimes.
Findings
Best subset selection outperforms the lasso in high-SNR regimes.
The lasso outperforms best subset selection in low-SNR regimes.
Relaxed lasso often performs best overall.
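To make the notion of an SNR regime concrete, the following is a minimal sketch of how one might simulate sparse regression data at a chosen SNR and run forward stepwise selection on it. It assumes the usual definition SNR = Var(x'beta) / sigma^2 and i.i.d. standard normal predictors. The function names `simulate` and `forward_stepwise`, and the settings n, p, s, and the two SNR values, are illustrative assumptions and not the paper's simulation design.

```python
import numpy as np

def simulate(n, p, s, snr, rng):
    """Draw (X, y) from a sparse linear model with a target SNR.

    Assumes i.i.d. standard normal predictors, so Var(x @ beta) = beta @ beta.
    """
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:s] = 1.0                      # first s coefficients carry the signal
    sigma2 = beta @ beta / snr          # noise variance that yields the target SNR
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    return X, y, beta, sigma2

def forward_stepwise(X, y, k):
    """Greedily add the predictor that most reduces the residual sum of squares."""
    active = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in range(X.shape[1]):
            if j in active:
                continue
            cols = active + [j]
            # Refit least squares on the candidate active set.
            coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = np.sum((y - X[:, cols] @ coef) ** 2)
            if rss < best_rss:
                best_j, best_rss = j, rss
        active.append(best_j)
    return active

rng = np.random.default_rng(0)
for snr in (0.05, 6.0):                 # one low-SNR and one high-SNR setting
    X, y, beta, sigma2 = simulate(n=200, p=20, s=5, snr=snr, rng=rng)
    chosen = forward_stepwise(X, y, k=5)
    hits = len(set(chosen) & set(range(5)))
    print(f"SNR={snr}: recovered {hits} of 5 true predictors")
```

A fuller comparison would add the lasso and relaxed lasso and score each method on held-out prediction error; this sketch only shows how the SNR knob enters the data-generating process.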
Abstract
In exciting new work, Bertsimas et al. (2016) showed that the classical best subset selection problem in regression modeling can be formulated as a mixed integer optimization (MIO) problem. Using recent advances in MIO algorithms, they demonstrated that best subset selection can now be solved at much larger problem sizes than what was thought possible in the statistics community. They presented empirical comparisons of best subset selection with other popular variable selection procedures, in particular, the lasso and forward stepwise selection. Surprisingly (to us), their simulations suggested that best subset selection consistently outperformed both methods in terms of prediction accuracy. Here we present an expanded set of simulations to shed more light on these comparisons. The summary is roughly as follows: (a) neither best subset selection nor the lasso uniformly dominate the…
Taxonomy
Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Advanced Statistical Process Monitoring
