Generating Compact Tree Ensembles via Annealing

Gitesh Dawer; Yangzi Guo; Adrian Barbu

arXiv:1709.05545 · stat.ML · February 21, 2020

TL;DR

This paper introduces an annealing-based method for building compact, interpretable tree ensembles: it grows a large pool of trees in parallel using many independent boosting threads, then selects a small subset and updates the leaf weights of the selected trees by loss optimization.

Contribution

It combines parallel tree growth across independent boosting threads with subset selection and leaf-weight optimization, producing ensembles that are smaller and, per the reported findings, more accurate than those obtained with traditional boosting or Random Forest.

Findings

01. Models have smaller loss than boosting

02. Achieves lower misclassification error

03. Enables flexible tree depth for better generalization

Abstract

Tree ensembles are flexible predictive models that can capture relevant variables and to some extent their interactions in a compact and interpretable manner. Most algorithms for obtaining tree ensembles are based on versions of boosting or Random Forest. Previous work showed that boosting algorithms exhibit a cyclic behavior of selecting the same tree again and again due to the way the loss is optimized. At the same time, Random Forest is not based on loss optimization and obtains a more complex and less interpretable model. In this paper we present a novel method for obtaining compact tree ensembles by growing a large pool of trees in parallel with many independent boosting threads and then selecting a small subset and updating their leaf weights by loss optimization. We allow for the trees in the initial pool to have different depths which further helps with generalization.…
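
The pipeline described in the abstract (grow a pool with independent boosting threads, then select a small subset and re-fit weights by loss optimization) can be illustrated with a rough sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the squared loss, the depth-1 trees (the paper allows different depths), the single scalar weight per tree standing in for per-leaf weights, the linear pruning schedule, and all function names (`fit_stump`, `boosting_thread`, `select_and_reweight`, `compact_ensemble`).

```python
import random

def fit_stump(xs, ys, feature):
    """Least-squares depth-1 tree on one feature: (feature, threshold, left, right)."""
    best = None
    for t in sorted({x[feature] for x in xs}):
        left = [y for x, y in zip(xs, ys) if x[feature] <= t]
        right = [y for x, y in zip(xs, ys) if x[feature] > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, (feature, t, lm, rm))
    if best is None:  # constant feature: fall back to a single leaf
        return (feature, float("inf"), sum(ys) / len(ys), 0.0)
    return best[1]

def predict_stump(stump, x):
    feature, threshold, left_value, right_value = stump
    return left_value if x[feature] <= threshold else right_value

def boosting_thread(xs, ys, rounds, rng, shrink=0.5):
    """One independent boosting thread: each round fits a stump to the residual."""
    residual = list(ys)
    trees = []
    for _ in range(rounds):
        feature = rng.randrange(len(xs[0]))  # random feature keeps threads diverse
        stump = fit_stump(xs, residual, feature)
        trees.append(stump)
        residual = [r - shrink * predict_stump(stump, x) for r, x in zip(residual, xs)]
    return trees

def select_and_reweight(xs, ys, pool, keep, epochs=200):
    """Gradient descent on per-tree weights while pruning the pool down to `keep`.

    Assumed schedule: the number of surviving trees decays linearly during the
    first half of the epochs; the second half only fine-tunes the survivors.
    """
    n = len(xs)
    outputs = [[predict_stump(t, x) for x in xs] for t in pool]
    energy = [sum(v * v for v in out) / n for out in outputs]
    active = list(range(len(pool)))
    w = [0.0] * len(pool)
    for e in range(epochs):
        # Step size bounded by 1 / trace of the Gram matrix, which keeps descent stable.
        step = 1.0 / (sum(energy[j] for j in active) or 1.0)
        pred = [sum(w[j] * outputs[j][i] for j in active) for i in range(n)]
        for j in active:
            grad = sum((pred[i] - ys[i]) * outputs[j][i] for i in range(n)) / n
            w[j] -= step * grad
        target = max(keep, len(pool) - (len(pool) - keep) * (e + 1) * 2 // epochs)
        active = sorted(active, key=lambda j: -abs(w[j]))[:target]
    return [(pool[j], w[j]) for j in active]

def compact_ensemble(xs, ys, threads=4, rounds=5, keep=5, seed=0):
    """Grow a pool with independent threads, then keep a small reweighted subset."""
    pool = []
    for k in range(threads):
        pool.extend(boosting_thread(xs, ys, rounds, random.Random(seed + k)))
    return select_and_reweight(xs, ys, pool, keep)

def predict(ensemble, x):
    return sum(weight * predict_stump(tree, x) for tree, weight in ensemble)
```

Calling `compact_ensemble(xs, ys, threads=4, rounds=5, keep=5)` on a list of feature vectors `xs` and numeric targets `ys` returns five `(tree, weight)` pairs drawn from a pool of twenty stumps. Pruning during optimization, rather than fitting all weights first and truncating afterwards, is one plausible reading of "annealing" in the title; the schedule actually used in the paper is not stated in this excerpt.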
