Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees
Jonathan Brophy, Daniel Lowd

TL;DR
The paper introduces IBUG, a simple non-parametric method to extend gradient-boosted regression trees with uncertainty estimates, achieving competitive probabilistic performance on benchmark datasets.
Contribution
IBUG provides a novel, flexible approach to uncertainty estimation for GBRTs using a tree-ensemble kernel and nearest neighbors, matching or outperforming the previous state of the art.
Findings
IBUG achieves similar or better performance than the previous state of the art across 22 benchmark regression datasets.
It can improve probabilistic calibration with a scalar tuning factor.
IBUG models the posterior distribution more flexibly than competing methods.
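The summary does not say how the scalar tuning factor is chosen. As one plausible illustration only, the sketch below assumes a Gaussian predictive distribution, treats the factor as a multiplier on the predicted standard deviation, and fits it by minimizing the negative log-likelihood on held-out validation data. The function names and the validation numbers are hypothetical and are not taken from the paper.

```python
import math


def gaussian_nll(y_true, means, stds, scale=1.0):
    """Average Gaussian negative log-likelihood with every std multiplied by `scale`."""
    total = 0.0
    for y, mu, sd in zip(y_true, means, stds):
        s = scale * sd
        total += 0.5 * math.log(2 * math.pi * s * s) + (y - mu) ** 2 / (2 * s * s)
    return total / len(y_true)


def fit_scale(y_true, means, stds):
    """Closed-form minimizer of gaussian_nll over scale > 0.

    Setting the derivative to zero gives scale^2 = mean of squared
    standardized residuals. Assumes at least one nonzero residual.
    """
    z2 = [((y - mu) / sd) ** 2 for y, mu, sd in zip(y_true, means, stds)]
    return math.sqrt(sum(z2) / len(z2))


# Hypothetical validation set in which the predicted stds are too small.
y_val = [1.0, 2.0, 3.0, 4.0]
mu_val = [1.5, 1.0, 3.5, 5.0]
sd_val = [0.25, 0.5, 0.25, 0.5]

gamma = fit_scale(y_val, mu_val, sd_val)
nll_before = gaussian_nll(y_val, mu_val, sd_val)
nll_after = gaussian_nll(y_val, mu_val, sd_val, scale=gamma)
```

A factor above 1 widens overconfident predictive distributions, and a factor below 1 narrows overly conservative ones. The point predictions are left unchanged either way.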
Abstract
Gradient-boosted regression trees (GBRTs) are hugely popular for solving tabular regression problems, but provide no estimate of uncertainty. We propose Instance-Based Uncertainty estimation for Gradient-boosted regression trees (IBUG), a simple method for extending any GBRT point predictor to produce probabilistic predictions. IBUG computes a non-parametric distribution around a prediction using the k-nearest training instances, where distance is measured with a tree-ensemble kernel. The runtime of IBUG depends on the number of training examples at each leaf in the ensemble, and can be improved by sampling trees or training instances. Empirically, we find that IBUG achieves similar or better performance than the previous state-of-the-art across 22 benchmark regression datasets. We also find that IBUG can achieve improved probabilistic performance by using different base GBRT models,…
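The idea described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the leaf index of every instance in every tree has already been extracted from a trained GBRT, measures affinity as the number of trees in which two instances share a leaf, and summarizes the k highest-affinity training targets with a variance. The function names, the variance floor, and the toy data are all hypothetical.

```python
from statistics import pvariance


def leaf_affinity(leaves_a, leaves_b):
    """Tree-ensemble kernel: number of trees where both instances share a leaf."""
    return sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)


def ibug_style_predict(train_leaves, train_y, test_leaves, point_pred, k, min_var=1e-12):
    """Return (mean, variance, neighbor_targets) for one test instance.

    train_leaves: one list of per-tree leaf indices per training instance.
    test_leaves:  per-tree leaf indices of the test instance.
    point_pred:   the GBRT's own point prediction, used as the mean.
    """
    scores = [leaf_affinity(test_leaves, row) for row in train_leaves]
    # Highest affinity first; ties are broken by training index.
    order = sorted(range(len(train_y)), key=lambda i: (-scores[i], i))
    targets = [train_y[i] for i in order[:k]]
    # Population variance of the neighbors' targets, floored (an assumption)
    # so that identical targets do not yield a zero-width distribution.
    variance = max(pvariance(targets), min_var)
    return point_pred, variance, targets


# Toy ensemble of 3 trees and 5 training instances (hypothetical values).
train_leaves = [[0, 1, 0], [0, 1, 1], [0, 0, 1], [1, 2, 2], [1, 2, 3]]
train_y = [1.0, 1.2, 0.8, 5.0, 5.5]

mean, variance, targets = ibug_style_predict(
    train_leaves, train_y, test_leaves=[0, 1, 1], point_pred=1.05, k=3
)
```

The returned neighbor targets can also be used directly as an empirical distribution instead of being reduced to a variance, which is what makes the approach non-parametric. The affinity loop visits every training instance, so its cost grows with the training set size. Restricting it to a sample of trees or training instances, as the abstract mentions, trades accuracy for speed.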
Taxonomy
Topics: Statistical Methods and Inference · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification
Methods: Balanced Selection
