Lassoed Tree Boosting
Alejandro Schuler, Yi Li, Mark van der Laan

TL;DR
This paper introduces a 'lassoed' gradient boosted tree algorithm with early stopping that achieves fast, dimension-independent convergence in nonparametric function spaces, supported by theory, simulations, and real data.
Contribution
It provides the first convergence proof for a Lasso-regularized gradient boosting method with early stopping in nonparametric settings.
Findings
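Achieves an L2 convergence rate faster than n^{-1/4}, independent of dimension, sparsity, or smoothness.
Simulations and real data confirm the theoretical predictions.
Performance and scalability are on par with standard boosting.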
Abstract
Gradient boosting performs exceptionally well in most prediction problems and scales well to large datasets. In this paper we prove that a "lassoed" gradient boosted tree algorithm with early stopping achieves faster than n^{-1/4} L2 convergence in the large nonparametric space of cadlag functions of bounded sectional variation. This rate is remarkable because it does not depend on the dimension, sparsity, or smoothness. We use simulation and real data to confirm our theory and demonstrate empirical performance and scalability on par with standard boosting. Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
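The abstract does not spell out the algorithm, so the following is only a toy, one-dimensional sketch of the general idea it names: a boosting step proposes indicator (stump) bases, a lasso refits the coefficients on all bases collected so far, and a validation loss triggers early stopping. It is not the authors' implementation. The function names, the penalty `lam`, the `patience` rule, and the simulated data are all assumptions made for illustration.

```python
import random

def basis(x, knot):
    """Indicator basis 1{x >= knot}: a depth-one tree (stump) in one dimension."""
    return 1.0 if x >= knot else 0.0

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def best_new_knot(xs, resid, used):
    """Boosting step: pick the unused knot whose indicator best fits the residuals."""
    best, best_gain = None, -1.0
    for k in sorted(set(xs) - set(used)):
        col = [basis(x, k) for x in xs]
        corr = sum(c * r for c, r in zip(col, resid))
        gain = corr * corr / sum(col)  # sum(col) >= 1 because k is a data point
        if gain > best_gain:
            best, best_gain = k, gain
    return best

def lasso_cd(cols, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            norm = sum(c * c for c in col) / n
            rho = sum(c * r for c, r in zip(col, resid)) / n + norm * beta[j]
            new = soft_threshold(rho, lam) / norm
            if new != beta[j]:
                resid = [r - c * (new - beta[j]) for c, r in zip(col, resid)]
                beta[j] = new
    return beta

def predict(x, knots, beta):
    return sum(b * basis(x, k) for k, b in zip(knots, beta))

def mse(xs, ys, knots, beta):
    return sum((y - predict(x, knots, beta)) ** 2 for x, y in zip(xs, ys)) / len(ys)

def lassoed_boosting(x_tr, y_tr, x_va, y_va, lam=0.01, max_rounds=15, patience=3):
    """Grow a nested sequence of basis sets; stop when validation loss stalls."""
    knots, resid = [], list(y_tr)
    best = (float("inf"), [], [])
    stale = 0
    for _ in range(max_rounds):
        k = best_new_knot(x_tr, resid, knots)
        if k is None:
            break
        knots.append(k)
        cols = [[basis(x, kk) for x in x_tr] for kk in knots]
        beta = lasso_cd(cols, y_tr, lam)  # lasso refit over all bases so far
        resid = [y - predict(x, knots, beta) for x, y in zip(x_tr, y_tr)]
        val = mse(x_va, y_va, knots, beta)
        if val < best[0]:
            best, stale = (val, list(knots), beta), 0
        else:
            stale += 1
            if stale >= patience:  # early stopping on validation loss
                break
    return best

# Simulated data (an assumption for the demo): a single step plus Gaussian noise.
random.seed(0)

def draw(n):
    xs = [random.random() for _ in range(n)]
    ys = [(1.0 if x >= 0.5 else 0.0) + random.gauss(0, 0.1) for x in xs]
    return xs, ys

x_tr, y_tr = draw(100)
x_va, y_va = draw(100)
val_mse, knots, beta = lassoed_boosting(x_tr, y_tr, x_va, y_va)
print(len(knots), round(val_mse, 4))
```

Each round enlarges the set of candidate bases, so the fitted models come from a nested sequence of function classes, which is the setting the abstract's early-stopping theorem refers to. A real multivariate version would use tree-structured bases and a tuned penalty rather than the fixed values assumed here.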
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques
