On sure early selection of the best subset
Ziwei Zhu, Shihao Wu

TL;DR
This paper establishes conditions under which best subset selection (BSS) reliably identifies true signals early in high-dimensional linear models, providing statistical guarantees and a computationally efficient screening strategy.
Contribution
It introduces a robust signal margin condition for sure early selection in BSS, proposes a 'screen then select' (STS) method to address the computational cost of BSS, and demonstrates lower false discovery rates (FDR) than competing methods.
Findings
BSS can achieve sure early selection under a specific signal margin condition.
The proposed STS method effectively reduces dimension and maintains low FDR.
Early paths of greedy algorithms like hard thresholding are comparable to BSS in FDR.
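A minimal sketch of the 'screen then select' idea, under stated assumptions: this summary does not describe the paper's screening step, so absolute marginal correlation is used here as a stand-in, and the function names (`best_subset`, `screen_then_select`, `demo`) and simulation settings are hypothetical. The subset size `s` is deliberately set below the true sparsity to mimic an early path.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, s):
    """Exhaustive best subset of size s: minimize the residual sum of squares."""
    best_rss, best_S = np.inf, ()
    for S in combinations(range(X.shape[1]), s):
        cols = list(S)
        coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        rss = float(np.sum((y - X[:, cols] @ coef) ** 2))
        if rss < best_rss:
            best_rss, best_S = rss, S
    return set(best_S)


def screen_then_select(X, y, s, keep):
    """Screen down to `keep` columns, then run best subset of size s on them."""
    # Screening step (ASSUMPTION, not taken from the paper): rank columns by
    # absolute marginal inner product with the centered response.
    scores = np.abs(X.T @ (y - y.mean()))
    kept = np.argsort(scores)[-keep:]
    chosen = best_subset(X[:, kept], y, s)
    # Map positions in the screened matrix back to original column indices.
    return {int(kept[j]) for j in chosen}


def demo(seed=0):
    """Simulate a sparse linear model and select fewer variables than the truth."""
    rng = np.random.default_rng(seed)
    n, p, true_support = 200, 50, [0, 1, 2, 3, 4]
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[true_support] = 3.0  # strong signals, chosen for illustration only
    y = X @ beta + rng.standard_normal(n)
    selected = screen_then_select(X, y, s=3, keep=10)
    false_discoveries = selected - set(true_support)
    return selected, false_discoveries


if __name__ == "__main__":
    selected, false_discoveries = demo()
    print("selected:", sorted(selected))
    print("false discoveries:", sorted(false_discoveries))
```

Screening shrinks the exhaustive search from all size-`s` subsets of 50 columns to those of 10 columns, which is what makes the brute-force step affordable in this toy setting.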
Abstract
The early solution path, which tracks the first few variables that enter the model of a selection procedure, is of profound importance to scientific discoveries. In practice, it is often statistically hopeless to identify all the important features with no false discovery, let alone the intimidating expense of experiments to test their significance. Such realistic limitations call for statistical guarantees for the early discoveries of a model selector. In this paper, we focus on the early solution path of best subset selection (BSS), where the sparsity constraint is set to be lower than the true sparsity. Under a sparse high-dimensional linear model, we establish the sufficient and (near) necessary condition for BSS to achieve sure early selection, or equivalently, zero false discovery throughout its early path. Essentially, this condition boils down to a lower bound of the minimum…
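For reference, the setup described in the abstract can be written out as follows. The notation ($\beta^*$, $s^*$, $\hat{S}(s)$) is assumed here for illustration and is not quoted from the paper.

```latex
% Linear model with a sparse true coefficient vector \beta^*:
%   y = X\beta^* + \varepsilon, \qquad s^* = \|\beta^*\|_0 .
% Best subset selection with sparsity constraint s:
\hat{\beta}(s) \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p,\ \|\beta\|_0 \le s} \|y - X\beta\|_2^2,
\qquad \hat{S}(s) = \operatorname{supp}\bigl(\hat{\beta}(s)\bigr).
% Early path: s < s^*. Sure early selection means no false discovery:
\hat{S}(s) \subseteq \operatorname{supp}(\beta^*)
\quad \text{for every } s < s^* .
```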
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Probabilistic and Robust Engineering Design
