Confidence Intervals for the Generalisation Error of Random Forests

Samyak Rajanala; Stephen Bates; Trevor Hastie; Robert Tibshirani

arXiv:2201.11210 · stat.ME · January 28, 2022 · 1 citation

TL;DR

This paper introduces confidence intervals for the generalisation error of random forests, built around the out-of-bag error estimate using the delta-method-after-bootstrap and the jackknife-after-bootstrap. The intervals require no additional trees to be grown and show better coverage than the naive interval.

Contribution

It proposes confidence intervals for the generalisation error of random forests, as estimated by out-of-bag error, based on the delta-method-after-bootstrap and the jackknife-after-bootstrap, with improved coverage over the naive confidence interval.

Findings

01. Improved coverage properties over the naive confidence interval

02. Demonstrated on both real and simulated data

03. No additional trees are needed to compute the intervals

Abstract

Out-of-bag error is commonly used as an estimate of generalisation error in ensemble-based learning models such as random forests. We present confidence intervals for this quantity using the delta-method-after-bootstrap and the jackknife-after-bootstrap techniques. These methods do not require growing any additional trees. We show that these new confidence intervals have improved coverage properties over the naive confidence interval, in real and simulated examples.
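
To make the quantities in the abstract concrete, the following is a minimal sketch of how an out-of-bag error and a naive normal-approximation interval around it might be computed from an already-fitted ensemble. It is an illustration only, not the paper's proposed delta-method or jackknife intervals. The function name `oob_naive_interval`, the inputs `preds` and `inbag`, the squared-error loss, and the choice of mean plus or minus z standard errors as the "naive" interval are all assumptions made for this example.

```python
import math


def oob_naive_interval(preds, inbag, y, z=1.96):
    """Out-of-bag error with a naive normal-approximation interval.

    preds[b][i]: prediction of tree b for observation i (hypothetical input).
    inbag[b][i]: how many times observation i appears in tree b's bootstrap sample.
    y[i]: observed response.
    Returns (oob_error, lower, upper) under squared-error loss (an assumption).
    """
    n_trees = len(preds)
    losses = []
    for i in range(len(y)):
        # Only trees that never saw observation i may predict it.
        oob_preds = [preds[b][i] for b in range(n_trees) if inbag[b][i] == 0]
        if not oob_preds:
            continue  # in-bag for every tree: no out-of-bag prediction exists
        oob_pred = sum(oob_preds) / len(oob_preds)
        losses.append((y[i] - oob_pred) ** 2)

    m = len(losses)
    if m < 2:
        raise ValueError("need at least two observations with out-of-bag predictions")

    err = sum(losses) / m
    # Treat the per-observation losses as if they were independent draws.
    var = sum((loss - err) ** 2 for loss in losses) / (m - 1)
    half_width = z * math.sqrt(var / m)
    return err, err - half_width, err + half_width


# Toy example: 3 trees, 4 observations (made-up numbers).
y = [1.0, 2.0, 3.0, 4.0]
preds = [
    [1.0, 2.5, 3.0, 3.0],
    [2.0, 2.0, 2.0, 4.0],
    [1.0, 1.5, 4.0, 5.0],
]
inbag = [
    [0, 1, 0, 2],
    [1, 0, 1, 0],
    [0, 0, 2, 1],
]
err, lower, upper = oob_naive_interval(preds, inbag, y)
print(err, lower, upper)
```

The sketch treats the per-observation out-of-bag losses as independent, which they are not in general, since the out-of-bag predictions share trees and training data. That simplification is a plausible reason for a naive interval of this kind to have poor coverage, and it is the sort of baseline the abstract compares against.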

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference