A Fast Sampling Gradient Tree Boosting Framework
Daniel Chao Zhou, Zhongming Jin, Tong Zhang

TL;DR
This paper introduces a fast gradient tree boosting framework that combines importance sampling and regularization to significantly accelerate training while maintaining accuracy.
Contribution
It proposes an importance sampling method and a regularizer for gradient tree boosting, with a proven linear convergence rate on logistic loss and substantial speedups over existing algorithms.
Findings
Achieves 2.5x to 18x acceleration on LogitBoost and LambdaMART
Maintains comparable performance with faster training times
Theoretical analysis confirms linear convergence rate
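For reference, a linear convergence rate is usually stated in the form below. The symbols are generic notation assumed here, not taken from the paper: L is the training loss, F_t the model after t boosting iterations, L* the optimal loss, and ρ a rate constant. The paper's own constants and conditions are not given in this summary.

```latex
L(F_t) - L^{*} \;\le\; (1 - \rho)^{t} \,\bigl( L(F_0) - L^{*} \bigr), \qquad 0 < \rho \le 1
```

Under a bound of this shape, the gap to the optimum shrinks by at least a constant factor per iteration.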
Abstract
As an adaptive, interpretable, robust, and accurate meta-algorithm for arbitrary differentiable loss functions, gradient tree boosting is one of the most popular machine learning techniques, though its computational cost severely limits its usage. Stochastic gradient boosting can accelerate gradient boosting by uniformly sampling training instances, but its estimator can introduce high variance. This motivates us to optimize gradient tree boosting. We combine gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance. Furthermore, we use a regularizer to improve the diagonal approximation in the Newton step of gradient boosting. The theoretical analysis shows that our strategies achieve a linear convergence rate on logistic loss. Empirical results show that our algorithm…
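The contrast between uniform and importance sampling can be illustrated with a minimal sketch. It assumes instances are drawn with probability proportional to the absolute value of their gradient, which is a common choice but is not confirmed by the abstract as the paper's sampling distribution. The function names `logistic_gradient` and `importance_sample` are hypothetical.

```python
import numpy as np


def logistic_gradient(y, score):
    """Gradient of the logistic loss w.r.t. the raw score, for labels y in {0, 1}."""
    return 1.0 / (1.0 + np.exp(-score)) - y


def importance_sample(grad, m, rng):
    """Draw m instance indices with probability proportional to |grad|.

    ASSUMPTION: the |gradient|-proportional distribution is illustrative only;
    the paper's actual sampling distribution is not stated in the abstract.

    Returns the sampled indices and weights 1 / (m * p_i), so that
    sum(weights * grad[idx]) is an unbiased estimate of grad.sum().
    """
    n = grad.shape[0]
    mag = np.abs(grad)
    total = mag.sum()
    # Fall back to uniform sampling when every gradient is zero.
    p = np.full(n, 1.0 / n) if total == 0 else mag / total
    idx = rng.choice(n, size=m, replace=True, p=p)
    weights = 1.0 / (m * p[idx])
    return idx, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=1000).astype(float)
    score = rng.normal(size=1000)
    g = logistic_gradient(y, score)

    idx, w = importance_sample(g, 100, rng)
    print("full gradient sum:      ", g.sum())
    print("importance-sampled est.:", (w * g[idx]).sum())
```

In a boosting round, the next tree would be fit on the sampled instances only, with the returned weights applied to their gradients so that the subsample statistics remain unbiased for the full-data statistics.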
Taxonomy
Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Machine Learning and Data Classification
