Generalised Boosted Forests

Indrayudh Ghosal; Giles Hooker

arXiv:2102.12561·stat.ME·March 4, 2021

TL;DR

This paper introduces a generalised boosting method for random forests that models non-Gaussian responses. It is evaluated on simulated and real-world data and comes with a variance estimator that costs no more to compute than the estimate itself.

Contribution

It extends boosted random forests to non-Gaussian responses: an MLE-type estimate in the link space is followed by two random forests fitted to weighted generalised residuals, and a variance estimator is obtained at the same computational cost as the estimate.

Findings

01. Reduces test-set log-likelihood in experiments

02. Effectively reduces bias in estimates

03. Provides conservative confidence interval coverage

Abstract

This paper extends recent work on boosting random forests to model non-Gaussian responses. Given an exponential family $E[Y \mid X] = g^{-1}(f(X))$, our goal is to obtain an estimate for $f$. We start with an MLE-type estimate in the link space and then define generalised residuals from it. We use these residuals and some corresponding weights to fit a base random forest and then repeat the same to obtain a boost random forest. We call the sum of these three estimators a "generalised boosted forest". We show with simulated and real data that both random forest steps reduce test-set log-likelihood, which we treat as our primary metric. We also provide a variance estimator, which we can obtain with the same computational cost as the original estimate itself. Empirical experiments on real-world data and simulations demonstrate that the methods can effectively reduce bias,…
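The three-stage construction described in the abstract can be sketched on a toy example. This is a minimal illustration under several assumptions that are not taken from the paper: the response is Bernoulli with a logit link, the generalised residuals and weights are taken to be Newton-style working residuals from a second-order expansion of the log-likelihood, the data are made up, and a weighted regression stump (the hypothetical helper `fit_stump`) stands in for each random forest so the block runs with the standard library alone.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_lik(y, f):
    # Bernoulli log-likelihood for link-space (logit) values f.
    return sum(yi * fi - math.log1p(math.exp(fi)) for yi, fi in zip(y, f))

def working_residuals(y, f):
    # ASSUMPTION: Newton-style residuals and weights around the current
    # link-space estimate; the paper's exact definition may differ.
    res, wts = [], []
    for yi, fi in zip(y, f):
        mu = sigmoid(fi)
        w = mu * (1.0 - mu)          # curvature of the log-likelihood
        res.append((yi - mu) / w)    # gradient divided by curvature
        wts.append(w)
    return res, wts

def weighted_mean(pairs):
    total = sum(w for _, w in pairs)
    return sum(r * w for r, w in pairs) / total

def fit_stump(x, r, w):
    # Hypothetical stand-in for a random forest: the single split on 1-D x
    # that minimises the weighted squared error of the residuals.
    best = None
    for t in sorted(set(x))[1:]:
        left = [(ri, wi) for xi, ri, wi in zip(x, r, w) if xi < t]
        right = [(ri, wi) for xi, ri, wi in zip(x, r, w) if xi >= t]
        ml, mr = weighted_mean(left), weighted_mean(right)
        loss = (sum(wi * (ri - ml) ** 2 for ri, wi in left)
                + sum(wi * (ri - mr) ** 2 for ri, wi in right))
        if best is None or loss < best[0]:
            best = (loss, t, ml, mr)
    _, t, ml, mr = best
    return lambda xi: ml if xi < t else mr

# Made-up toy data, for illustration only.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 0, 1, 0, 1, 1]

# Stage 0: constant MLE in the link space.
p = sum(y) / len(y)
f0 = [math.log(p / (1.0 - p))] * len(y)

# Stage 1: base learner fitted to weighted residuals of stage 0.
r1, w1 = working_residuals(y, f0)
base = fit_stump(x, r1, w1)
f1 = [fi + base(xi) for fi, xi in zip(f0, x)]

# Stage 2: boost learner fitted to weighted residuals of stage 1.
r2, w2 = working_residuals(y, f1)
boost = fit_stump(x, r2, w2)
f2 = [fi + boost(xi) for fi, xi in zip(f1, x)]

ll0, ll1, ll2 = log_lik(y, f0), log_lik(y, f1), log_lik(y, f2)
print(ll0, ll1, ll2)
```

The final estimate `f2` is the sum of the constant, the base fit, and the boost fit, mirroring the "sum of these three estimators" in the abstract. The log-likelihoods here are computed on the training data, whereas the paper reports a test-set metric, so the numbers only show the mechanics of the updates.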


Taxonomy

Topics: Statistical Methods and Inference · Probabilistic and Robust Engineering Design · Statistical Methods and Bayesian Inference