agtboost: Adaptive and Automatic Gradient Tree Boosting Computations

Berent Ånund Strømnes Lunde; Tore Selland Kleppe

arXiv:2008.12625·stat.ML·August 31, 2020

TL;DR

The agtboost R package offers a faster, more automated gradient tree boosting implementation that adapts to data complexity, reduces user effort, and includes advanced validation and feature importance tools.

Contribution

It introduces an automatic, adaptive gradient boosting method that simplifies model tuning and enhances computational efficiency compared to existing frameworks.

Findings

01 Significantly decreases computation time.

02 Automatically determines the number of trees.

03 Includes advanced feature importance and validation functions.

Abstract

agtboost is an R package implementing fast gradient tree boosting computations in a manner similar to other established frameworks such as xgboost and LightGBM, but with significant decreases in computation time and required mathematical and technical knowledge. The package automatically takes care of split/no-split decisions and selects the number of trees in the gradient tree boosting ensemble, i.e., agtboost adapts the complexity of the ensemble automatically to the information in the data. All of this is done during a single training run, which is made possible by utilizing developments in information theory for tree algorithms (arXiv:2008.05926v1 [stat.ME]). agtboost also comes with a feature importance function that eliminates the common practice of inserting noise features. Further, a useful model validation function performs the Kolmogorov-Smirnov test on the learned…
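To make "selects the number of trees during a single training run" concrete, here is a minimal Python sketch of gradient boosting with depth-one trees (stumps) under squared error, where boosting stops as soon as a new tree no longer lowers an estimate of generalization loss. This is not the agtboost algorithm: per the abstract, agtboost bases its decisions on an information-theoretic criterion, whereas this sketch swaps in a held-out validation set as a stand-in for that estimate. It also covers only the number-of-trees decision, not split/no-split decisions inside a tree. The names `make_data`, `fit_stump`, and `boost` are hypothetical and chosen for illustration.

```python
import random


def make_data(n, rng):
    """Toy 1-D regression data (an assumption): a step at x = 0.5 plus noise."""
    x = [rng.random() for _ in range(n)]
    y = [(1.0 if xi > 0.5 else 0.0) + rng.gauss(0.0, 0.1) for xi in x]
    return x, y


def mse(y, pred):
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)


def fit_stump(x, residuals):
    """Return (threshold, left_value, right_value) for the best single split
    under squared error, or None if all x values are identical."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    rs = [residuals[i] for i in order]
    n, total = len(xs), sum(rs)
    best_gain, best = -1.0, None
    left_sum = 0.0
    for k in range(1, n):
        left_sum += rs[k - 1]
        if xs[k] == xs[k - 1]:
            continue  # cannot split between equal x values
        right_sum = total - left_sum
        # Reduction in squared error versus predicting one overall mean.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k) - total ** 2 / n
        if gain > best_gain:
            best_gain = gain
            best = ((xs[k - 1] + xs[k]) / 2.0, left_sum / k, right_sum / (n - k))
    return best


def boost(x_train, y_train, x_val, y_val, learning_rate=0.1, max_trees=500):
    """Boost stumps; stop at the first tree that fails to lower held-out MSE.
    The held-out set is a proxy, NOT the criterion agtboost uses."""
    base = sum(y_train) / len(y_train)
    pred_train = [base] * len(y_train)
    pred_val = [base] * len(y_val)
    best_val = mse(y_val, pred_val)
    stumps = []
    for _ in range(max_trees):
        residuals = [yi - pi for yi, pi in zip(y_train, pred_train)]
        stump = fit_stump(x_train, residuals)
        if stump is None:
            break
        thr, left, right = stump
        new_val = [p + learning_rate * (left if xi < thr else right)
                   for p, xi in zip(pred_val, x_val)]
        new_loss = mse(y_val, new_val)
        if new_loss >= best_val:
            break  # the extra tree no longer helps: ensemble size is fixed here
        pred_train = [p + learning_rate * (left if xi < thr else right)
                      for p, xi in zip(pred_train, x_train)]
        pred_val, best_val = new_val, new_loss
        stumps.append(stump)
    return base, stumps, best_val


rng = random.Random(0)
x_tr, y_tr = make_data(200, rng)
x_va, y_va = make_data(100, rng)
base, stumps, val_loss = boost(x_tr, y_tr, x_va, y_va)
print(len(stumps), "trees kept; held-out MSE", round(val_loss, 4))
```

The design point to notice is that `max_trees` acts only as a safety cap: the ensemble size is determined inside the loop rather than tuned across repeated runs. A criterion computed from the training data alone, as the abstract describes, would additionally remove the need to set aside validation data.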

Code & Models

Repositories

Blunde1/aGTBoost

Taxonomy

Topics: Statistical Methods and Inference · Machine Learning and Data Classification · Bayesian Methods and Mixture Models