Locally Optimized Random Forests

Tim Coleman; Kimberly Kaufeld; Mary Frances Dorn; Lucas Mentch

arXiv:1908.09967·stat.ML·August 28, 2019

TL;DR

This paper introduces Locally Optimized Random Forests, a method that adapts traditional random forests to handle distributional shifts between training and test data. The approach is especially useful for extreme event prediction.

Contribution

It proposes a weighted random forest approach using importance sampling to account for distributional differences between training and test data.
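The importance-sampling idea behind this weighting can be summarised by a standard identity, given here as a sketch and not as the paper's exact formulation. The symbols are assumptions introduced for illustration: $p_{1}$ and $p_{2}$ are the covariate densities of the training and test distributions, $f$ is a predictor, $\ell$ is a loss, and $w$ is the likelihood ratio. The identity holds when the conditional distribution of the response given the covariates is the same under both distributions, and $p_{1}(x) > 0$ wherever $p_{2}(x) > 0$.

```latex
\mathbb{E}_{(X,Y)\sim P_{2}}\big[\ell(f(X), Y)\big]
  = \mathbb{E}_{(X,Y)\sim P_{1}}\big[w(X)\,\ell(f(X), Y)\big],
\qquad
w(x) = \frac{p_{2}(x)}{p_{1}(x)}.
```

In words: a test-distribution risk can be estimated from labeled training data by reweighting each training observation by $w(x)$, which is why an estimate of the likelihood ratio is enough to adapt the fit.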

Findings

01. Effective in handling covariate shift in predictive modeling.

02. Improves forecasting accuracy for extreme events like hurricanes.

03. Provides a data-driven adaptation for machine learning under distributional changes.

Abstract

Standard supervised learning procedures are validated against a test set that is assumed to have come from the same distribution as the training data. However, in many problems, the test data may have come from a different distribution. We consider the case of having many labeled observations from one distribution, $P_{1}$, and making predictions at unlabeled points that come from $P_{2}$. We combine the high predictive accuracy of random forests (Breiman, 2001) with an importance sampling scheme, where the splits and predictions of the base-trees are done in a weighted manner, which we call Locally Optimized Random Forests. These weights correspond to a non-parametric estimate of the likelihood ratio between the training and test distributions. To estimate these ratios with an unlabeled test set, we make the covariate shift assumption, where the differences in distribution are only a…
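The weighting scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is in the repository listed on this page), and it swaps in two assumptions. First, the likelihood ratio is estimated with a logistic-regression domain classifier, a parametric stand-in for the non-parametric estimator the abstract mentions. Second, the weights are passed to scikit-learn's `RandomForestRegressor` through `sample_weight`, which weights the split criterion and leaf averages of each tree. The function names `estimate_density_ratio` and `fit_weighted_forest` and the `clip` parameter are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression


def estimate_density_ratio(X_train, X_test, clip=50.0):
    """Estimate w(x) = p2(x) / p1(x) at each training point.

    Assumption: a logistic-regression domain classifier is used here as a
    simple stand-in for a non-parametric likelihood-ratio estimator.
    """
    X = np.vstack([X_train, X_test])
    # Domain label: 0 for training (P1) rows, 1 for test (P2) rows.
    d = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    clf = LogisticRegression(max_iter=1000).fit(X, d)

    # P(d = 1 | x) at the training points, kept away from 0 and 1.
    p_test = np.clip(clf.predict_proba(X_train)[:, 1], 1e-6, 1 - 1e-6)

    # Bayes' rule: p2(x) / p1(x) = odds(d = 1 | x) * (n_train / n_test).
    ratio = (p_test / (1.0 - p_test)) * (len(X_train) / len(X_test))

    # Clipping (an added heuristic) limits the influence of extreme weights.
    return np.clip(ratio, 0.0, clip)


def fit_weighted_forest(X_train, y_train, X_test, seed=0):
    """Fit a random forest whose trees weight training rows by w(x)."""
    w = estimate_density_ratio(X_train, X_test)
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train, sample_weight=w)
    return forest, w
```

A usage example on synthetic data, where the test covariates are shifted to the right of the training covariates, so training points with larger values should receive larger weights:

```python
rng = np.random.default_rng(0)
X_tr = rng.normal(0.0, 1.0, size=(400, 1))
y_tr = np.sin(X_tr[:, 0]) + rng.normal(0.0, 0.1, size=400)
X_te = rng.normal(2.0, 1.0, size=(200, 1))  # unlabeled, shifted test points

forest, w = fit_weighted_forest(X_tr, y_tr, X_te)
preds = forest.predict(X_te)
```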

Code & Models

Repositories

tim-coleman/Locally-Optimized-Random-Forests

Taxonomy

Topics: Hydrology and Drought Analysis · Anomaly Detection Techniques and Applications · Hydrological Forecasting Using AI