On Variance Estimation of Random Forests with Infinite-Order U-statistics

Tianning Xu; Ruoqing Zhu; Xiaofeng Shao

arXiv:2202.09008 · stat.ML · February 16, 2023 · 1 cites

TL;DR

This paper introduces a novel variance estimator for random forests based on a new Hoeffding decomposition perspective, ensuring unbiasedness, ratio consistency, and improved finite-sample performance for confidence intervals.

Contribution

It proposes a new unbiased variance estimator using a peak region dominance view, establishing ratio consistency and connecting with existing estimators.

Findings

01 The new estimator has lower bias in simulations.

02 It achieves targeted coverage rates for confidence intervals.

03 The method is theoretically justified with ratio consistency.

Abstract

Infinite-order U-statistics (IOUS) have been used extensively with subbagging ensemble learning algorithms such as random forests to quantify their uncertainty. While normality results for IOUS have been studied extensively, variance estimation approaches and their theoretical properties remain mostly unexplored. Existing approaches mainly rely on the leading-term dominance property of the Hoeffding decomposition. However, this view usually leads to biased estimation when the kernel size is large or the sample size is small. On the other hand, while several unbiased estimators exist in the literature, their relationships and theoretical properties, especially ratio consistency, have never been studied. As a result, the constructed confidence intervals carry no performance guarantees. To bridge these gaps in the literature, we propose a new view of the Hoeffding decomposition for…
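
To make the leading-term argument concrete, here is a minimal sketch based on the standard Hoeffding variance identity for a complete order-k U-statistic on n i.i.d. observations: Var(U_n) = sum over d = 1..k of w_d * xi_d^2, where w_d = C(k,d) C(n-k,k-d) / C(n,k) and xi_d^2 is the covariance of the kernel on two subsamples sharing d points. This is not the paper's estimator. The function names `hoeffding_weights` and `peak_overlap` are hypothetical and used only for illustration. The sketch shows where the combinatorial weights concentrate, which is one way to see why a d = 1 approximation can be poor when the kernel size k is large relative to n.

```python
from math import comb


def hoeffding_weights(n: int, k: int) -> list[float]:
    """Return w_d = C(k, d) * C(n - k, k - d) / C(n, k) for d = 0..k.

    w_d is the probability that two size-k subsamples drawn from n points
    share exactly d points (a hypergeometric probability).
    """
    if not 0 < k <= n:
        raise ValueError("require 0 < k <= n")
    total = comb(n, k)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so overlaps that are impossible get weight 0 automatically.
    return [comb(k, d) * comb(n - k, k - d) / total for d in range(k + 1)]


def peak_overlap(n: int, k: int) -> int:
    """Return the overlap size d >= 1 that carries the largest weight w_d."""
    w = hoeffding_weights(n, k)
    return max(range(1, k + 1), key=lambda d: w[d])


if __name__ == "__main__":
    for n, k in [(1000, 10), (100, 50)]:
        w = hoeffding_weights(n, k)
        print(f"n={n}, k={k}: peak overlap d={peak_overlap(n, k)}, w_1={w[1]:.3g}")
```

With a small kernel (n = 1000, k = 10) the largest weight among d >= 1 sits at d = 1, which matches the leading-term picture. With a large kernel (n = 100, k = 50) the weights peak near d = k^2/n = 25, so the d = 1 term carries almost none of the combinatorial mass. Whether this is exactly what the authors mean by their new view cannot be confirmed from the truncated abstract.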

Code & Models

Repositories

teazrq/rlt

Taxonomy

Topics: Neural Networks and Applications · Statistical Methods and Inference · Gaussian Processes and Bayesian Inference