NRGBoost: Energy-Based Generative Boosted Trees
João Bravo

TL;DR
NRGBoost is an energy-based generative boosting algorithm that explicitly models the data density (up to a normalization constant). It achieves discriminative performance similar to GBDT and offers competitive sampling, extending tree-based methods to generative tasks.
Contribution
The paper presents a novel energy-based generative boosting algorithm that extends traditional tree models to explicitly model data density for both discriminative and generative tasks.
Findings
Achieves discriminative performance similar to GBDT on a number of real-world tabular datasets.
Outperforms alternative generative approaches in experiments.
Competitive with neural networks for sampling tasks.
Abstract
Despite the rise to dominance of deep learning in unstructured data domains, tree-based methods such as Random Forests (RF) and Gradient Boosted Decision Trees (GBDT) are still the workhorses for handling discriminative tasks on tabular data. We explore generative extensions of these popular algorithms with a focus on explicitly modeling the data density (up to a normalization constant), thus enabling other applications besides sampling. As our main contribution we propose an energy-based generative boosting algorithm that is analogous to the second-order boosting implemented in popular libraries like XGBoost. We show that, despite producing a generative model capable of handling inference tasks over any input variable, our proposed algorithm can achieve similar discriminative performance to GBDT on a number of real world tabular datasets, outperforming alternative generative…
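To make the point about modeling the density only up to a normalization constant more concrete, here is a minimal toy sketch. It is not NRGBoost and not the author's code: the table `LOG_DENSITY`, the function `conditional`, and the two binary variables are all hypothetical names chosen for illustration. It only shows why an unnormalized log-density is enough to answer conditional queries about any variable: the unknown constant cancels when the scores are renormalized over the target variable.

```python
import math

# Hypothetical unnormalized log-density over two binary variables (x0, x1):
# p(x) is proportional to exp(LOG_DENSITY[x]). The values are made up.
LOG_DENSITY = {
    (0, 0): 0.5,
    (0, 1): -1.0,
    (1, 0): 0.0,
    (1, 1): 2.0,
}


def conditional(log_density, target_index, observed, shift=0.0):
    """Return p(x[target_index] = v | remaining variables = observed), v in {0, 1}.

    `observed` holds the values of the other variables in their original order.
    `shift` adds a constant to every log-density value, standing in for an
    unknown normalization constant.
    """
    scores = {}
    for value in (0, 1):
        point = list(observed)
        point.insert(target_index, value)  # rebuild the full point
        scores[value] = log_density[tuple(point)] + shift

    # Normalize over the target variable only, using log-sum-exp for stability.
    top = max(scores.values())
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores.values()))
    return {v: math.exp(s - log_norm) for v, s in scores.items()}


if __name__ == "__main__":
    # Treat x1 as the "label": p(x1 | x0 = 1).
    print(conditional(LOG_DENSITY, target_index=1, observed=(1,)))
    # The same table answers a query about x0 instead: p(x0 | x1 = 0).
    print(conditional(LOG_DENSITY, target_index=0, observed=(0,)))
    # Shifting every log-density value leaves the conditional unchanged.
    print(conditional(LOG_DENSITY, target_index=1, observed=(1,), shift=7.0))
```

In this sketch the same table serves queries about either variable, which mirrors the abstract's claim of handling inference tasks over any input variable. Enumerating the target values is only practical here because the variable is discrete with few values.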
Taxonomy
Topics: Algorithms and Data Compression
Methods: Focus
