A Fast Sampling Gradient Tree Boosting Framework

Daniel Chao Zhou; Zhongming Jin; Tong Zhang

arXiv:1911.08820·cs.LG·November 21, 2019



TL;DR

This paper introduces a fast gradient tree boosting framework that combines importance sampling and regularization to significantly accelerate training while maintaining accuracy.

Contribution

It proposes a novel importance sampling method and regularizer for gradient boosting, achieving linear convergence and substantial speedups over existing algorithms.

Findings

01. Achieves 2.5x to 18x acceleration on LogitBoost and LambdaMART

02. Maintains comparable predictive performance while training faster

03. Theoretical analysis supports a linear convergence rate on logistic loss

Abstract

As an adaptive, interpretable, robust, and accurate meta-algorithm for arbitrary differentiable loss functions, gradient tree boosting is one of the most popular machine learning techniques, though its computational expense severely limits its usage. Stochastic gradient boosting can accelerate gradient boosting by uniformly sampling training instances, but its estimator can introduce high variance. This motivates us to optimize gradient tree boosting. We combine gradient tree boosting with importance sampling, which achieves better performance by reducing the stochastic variance. Furthermore, we use a regularizer to improve the diagonal approximation in the Newton step of gradient boosting. The theoretical analysis supports that our strategies achieve a linear convergence rate on logistic loss. Empirical results show that our algorithm…
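The variance argument in the abstract can be illustrated with a small, self-contained sketch. It is not the paper's sampling scheme: it assumes, purely for illustration, that instance i is drawn with probability proportional to |g_i| (the magnitude of its gradient) and that each draw is reweighted by 1 / p_i so the estimate of the gradient sum stays unbiased. The function names and the synthetic gradients are hypothetical.

```python
import random


def uniform_estimate(grads, m, rng):
    """Estimate sum(grads) from m instances sampled uniformly with replacement."""
    n = len(grads)
    idx = [rng.randrange(n) for _ in range(m)]
    return sum(grads[i] for i in idx) * n / m


def importance_estimate(grads, m, rng, eps=1e-12):
    """Estimate sum(grads) by drawing instance i with probability p_i
    proportional to |g_i| (an assumption of this sketch), then dividing
    each drawn gradient by p_i so the estimator remains unbiased."""
    weights = [abs(g) + eps for g in grads]  # eps keeps every p_i positive
    total = sum(weights)
    probs = [w / total for w in weights]
    idx = rng.choices(range(len(grads)), weights=probs, k=m)
    return sum(grads[i] / probs[i] for i in idx) / m


def empirical_variance(estimator, grads, m, trials, seed):
    """Variance of the estimator over repeated independent samples."""
    rng = random.Random(seed)
    values = [estimator(grads, m, rng) for _ in range(trials)]
    mean = sum(values) / trials
    return sum((v - mean) ** 2 for v in values) / trials


def demo_gradients(seed=0):
    """Synthetic, skewed gradients: many near zero, a few large ones."""
    rng = random.Random(seed)
    small = [rng.gauss(0.0, 0.01) for _ in range(950)]
    large = [rng.gauss(0.0, 1.0) for _ in range(50)]
    return small + large


if __name__ == "__main__":
    grads = demo_gradients()
    var_uniform = empirical_variance(uniform_estimate, grads, 50, 2000, seed=1)
    var_importance = empirical_variance(importance_estimate, grads, 50, 2000, seed=1)
    print("uniform sampling variance:   ", var_uniform)
    print("importance sampling variance:", var_importance)
```

When a few instances carry most of the gradient mass, as in the synthetic data here, uniform sampling often misses them, while magnitude-proportional sampling draws them frequently and down-weights them, which is why its estimate fluctuates less for the same sample size.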


Taxonomy

Topics: Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques · Machine Learning and Data Classification