Bootstrap Bias Corrections for Ensemble Methods
Giles Hooker, Lucas Mentch

TL;DR
This paper introduces a residual bootstrap bias correction for ensemble regression methods that reduces bias and improves predictive accuracy at roughly double the cost of training the original ensemble.
Contribution
It proposes a bootstrap bias correction for ensemble methods such as random forests that can be approximated at about double the original training cost without introducing additional variance.
Findings
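Bias correction improves test-set accuracy over random forests by up to 70% on example problems from the UCI repository
For ensembles of trees, the correction can be approximated at only double the cost of training the original ensemble
Empirical results show substantial improvements in both bias and predictive accuracy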
Abstract
This paper examines the use of a residual bootstrap for bias correction in machine learning regression methods. Accounting for bias is an important obstacle in recent efforts to develop statistical inference for machine learning methods. We demonstrate empirically that the proposed bootstrap bias correction can lead to substantial improvements in both bias and predictive accuracy. In the context of ensembles of trees, we show that this correction can be approximated at only double the cost of training the original ensemble without introducing additional variance. Our method is shown to improve test-set accuracy over random forests by up to 70% on example problems from the UCI repository.
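The general idea of a residual bootstrap bias correction can be sketched as follows. This is a minimal illustration of the textbook technique, not the authors' implementation: the Gaussian kernel smoother is a stand-in base learner chosen so the example needs only the standard library, the function names (`kernel_smoother`, `bias_corrected_predict`) and parameters are hypothetical, and centering the residuals before resampling is an assumption. It refits the learner once per bootstrap replicate and does not reproduce the cheaper approximation for tree ensembles that the abstract mentions.

```python
import math
import random


def kernel_smoother(xs, ys, bandwidth=0.3):
    """Stand-in base learner (assumption): Gaussian-kernel weighted average of ys."""
    def predict(x0):
        weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
        return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return predict


def bias_corrected_predict(xs, ys, queries, fit=kernel_smoother, n_boot=50, seed=0):
    """Residual bootstrap bias correction (generic sketch, hypothetical helper)."""
    rng = random.Random(seed)

    # 1. Fit the base learner and compute centered training residuals.
    base = fit(xs, ys)
    fitted = [base(x) for x in xs]
    resid = [y - f for y, f in zip(ys, fitted)]
    mean_resid = sum(resid) / len(resid)
    resid = [r - mean_resid for r in resid]

    base_pred = [base(q) for q in queries]
    boot_sum = [0.0] * len(queries)

    # 2. Refit on synthetic responses: fitted values plus resampled residuals.
    for _ in range(n_boot):
        y_star = [f + rng.choice(resid) for f in fitted]
        model_star = fit(xs, y_star)
        for j, q in enumerate(queries):
            boot_sum[j] += model_star(q)

    # 3. Estimated bias = mean bootstrap prediction - original prediction,
    #    so the corrected prediction is 2 * original - mean bootstrap prediction.
    boot_mean = [s / n_boot for s in boot_sum]
    return [2.0 * p - m for p, m in zip(base_pred, boot_mean)]
```

Calling `bias_corrected_predict(xs, ys, queries)` returns one corrected prediction per query point. Any learner with the same `fit(xs, ys) -> predict` shape could be passed as `fit`; with a random forest one would likely use out-of-bag residuals rather than in-sample ones, though that detail is not covered by this sketch.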
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Statistical Methods and Models · Statistical Methods and Inference
