Stability of Random Forests and Coverage of Random-Forest Prediction Intervals

Yan Wang; Huaiqing Wu; Dan Nettleton

arXiv:2310.18814·stat.ML·October 31, 2023·5 cites

TL;DR

This paper proves that random forests are stable when the squared response $Y^{2}$ is not heavy-tailed, and uses that stability to bound the coverage probability of prediction intervals built from out-of-bag errors.

Contribution

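It establishes stability for the practical random-forest algorithm, as implemented in packages such as randomForest in R, and derives a non-asymptotic lower bound and, under a further mild condition, a complementary upper bound on the coverage of out-of-bag prediction intervals.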

Findings

01

Random forests are stable provided the squared response $Y^{2}$ does not have a heavy tail.

02

Out-of-bag prediction intervals admit a non-asymptotic lower bound on coverage and, under a further condition typically met when $Y$ is continuous, a complementary upper bound.

03

Empirical results suggest stability may persist beyond the theoretical assumption, including for heavy-tailed $Y^{2}$.

Abstract

We establish stability of random forests under the mild condition that the squared response ($Y^{2}$) does not have a heavy tail. In particular, our analysis holds for the practical version of random forests that is implemented in popular packages like randomForest in R. Empirical results show that stability may persist even beyond our assumption and hold for heavy-tailed $Y^{2}$. Using the stability property, we prove a non-asymptotic lower bound for the coverage probability of prediction intervals constructed from the out-of-bag error of random forests. With another mild condition that is typically satisfied when $Y$ is continuous, we also establish a complementary upper bound, which can be similarly established for the jackknife prediction interval constructed from an arbitrary stable algorithm. We also discuss the asymptotic coverage probability under assumptions…
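To make the idea of an out-of-bag prediction interval concrete, here is a minimal sketch in pure Python. It is not the authors' code and makes several assumptions: bagged one-split regression stumps stand in for a full random forest, the interval is the ensemble prediction plus or minus an empirical quantile of the absolute out-of-bag errors, and the rank rule $\lceil (1-\alpha)(m+1) \rceil$ is borrowed from conformal-style calibration rather than taken from the paper. The names `fit_stump`, `fit_oob_intervals`, and `simulate` are hypothetical.

```python
import math
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D inputs; return a predict function."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total = sum(y for _, y in pairs)
    best_gain, best = -1.0, None
    left_sum = 0.0
    for k in range(1, n):
        left_sum += pairs[k - 1][1]
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical x values
        right_sum = total - left_sum
        # Maximizing this quantity is equivalent to minimizing squared error.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if gain > best_gain:
            thr = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_gain, best = gain, (thr, left_sum / k, right_sum / (n - k))
    if best is None:  # all x identical: fall back to the mean
        mean = total / n
        return lambda x: mean
    thr, left_mean, right_mean = best
    return lambda x: left_mean if x <= thr else right_mean


def fit_oob_intervals(xs, ys, n_trees=100, alpha=0.1, seed=0):
    """Bag stumps and calibrate a symmetric interval from absolute OOB errors."""
    rng = random.Random(seed)
    n = len(xs)
    trees, in_bag = [], []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
        in_bag.append(set(idx))

    # The OOB prediction for point i averages only trees that never saw i.
    abs_errors = []
    for i in range(n):
        preds = [t(xs[i]) for t, bag in zip(trees, in_bag) if i not in bag]
        if preds:
            abs_errors.append(abs(ys[i] - sum(preds) / len(preds)))
    abs_errors.sort()
    m = len(abs_errors)
    rank = min(m, math.ceil((1 - alpha) * (m + 1)))  # assumed rank rule
    half_width = abs_errors[rank - 1]

    def interval(x):
        center = sum(t(x) for t in trees) / len(trees)
        return center - half_width, center + half_width

    return interval, half_width


def simulate(n_train=200, n_test=500, alpha=0.1, seed=1):
    """Estimate empirical coverage on fresh data from a synthetic model."""
    rng = random.Random(seed)

    def draw(n):
        xs = [rng.uniform(0, 1) for _ in range(n)]
        ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
        return xs, ys

    xs, ys = draw(n_train)
    interval, half_width = fit_oob_intervals(xs, ys, alpha=alpha, seed=seed)
    test_xs, test_ys = draw(n_test)
    hits = 0
    for x, y in zip(test_xs, test_ys):
        lo, hi = interval(x)
        hits += lo <= y <= hi
    return hits / n_test, half_width


coverage, half_width = simulate()
print(f"empirical coverage: {coverage:.3f}, half-width: {half_width:.3f}")
```

The synthetic response here is light-tailed, so it sits inside the regime the abstract covers. Swapping the Gaussian noise for a heavy-tailed draw would be one informal way to probe the empirical claim that stability may persist beyond that assumption.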

Videos

Stability of Random Forests and Coverage of Random-Forest Prediction Intervals · SlidesLive

Taxonomy

Topics: Probabilistic and Robust Engineering Design · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms