Early stopping for $L^2$-boosting in high-dimensional linear models
Bernhard Stankewitz

TL;DR
This paper proposes a data-driven early stopping method for $L^2$-boosting in high-dimensional linear models that achieves statistical optimality at a significantly lower computational cost than full-path methods.
Contribution
It introduces a sequential early stopping rule for $L^2$-boosting that maintains statistical guarantees while being computationally more efficient than existing full-path criteria.
Findings
Early stopping preserves statistical optimality.
Method performs comparably to state-of-the-art algorithms.
Significant reduction in computational cost.
Abstract
Increasingly high-dimensional data sets require that estimation methods do not only satisfy statistical guarantees but also remain computationally feasible. In this context, we consider $L^2$-boosting via orthogonal matching pursuit in a high-dimensional linear model and analyze a data-driven early stopping time of the algorithm, which is sequential in the sense that its computation is based on the first iterations only. This approach is much less costly than established model selection criteria that require the computation of the full boosting path. We prove that sequential early stopping preserves statistical optimality in this setting in terms of a fully general oracle inequality for the empirical risk and recently established optimal convergence rates for the population risk. Finally, an extensive simulation study shows that at an immensely reduced…
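A minimal sketch of the kind of procedure the abstract describes, assuming a residual-based stopping rule: orthogonal matching pursuit adds one column per step and halts as soon as the mean squared empirical residual drops to or below a threshold. The function name `omp_early_stopping`, the threshold `kappa`, and the exact form of the stopping rule are illustrative assumptions, not details taken from the text above.

```python
import numpy as np


def omp_early_stopping(X, y, kappa, max_iter=None):
    """Orthogonal matching pursuit with a sequential, residual-based stop.

    Assumption (hypothetical rule): stop at the first iteration m where
    ||y - X @ coef_m||^2 / n <= kappa. Columns of X are assumed nonzero.
    """
    n, p = X.shape
    max_iter = min(n, p) if max_iter is None else max_iter
    col_norms = np.linalg.norm(X, axis=0)

    selected = []            # indices of columns chosen so far
    coef = np.zeros(p)       # current coefficient estimate
    residual = y.copy()      # residual of the current fit

    for _ in range(max_iter):
        # Sequential check: uses only the iterations computed so far.
        if residual @ residual / n <= kappa:
            break

        # Greedy step: column most correlated with the current residual.
        scores = np.abs(X.T @ residual) / col_norms
        if selected:
            scores[selected] = -np.inf
        selected.append(int(np.argmax(scores)))

        # Orthogonal step: refit least squares on all selected columns.
        beta_sel, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        coef = np.zeros(p)
        coef[selected] = beta_sel
        residual = y - X[:, selected] @ beta_sel

    return coef, selected
```

Because the check runs before each greedy step, the cost is that of the iterations actually performed rather than of the full boosting path. How to choose `kappa` (for example from an estimate of the noise level) is left open in this sketch.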
Taxonomy
Topics: Statistical Methods and Inference
