Locally Optimized Random Forests
Tim Coleman, Kimberly Kaufeld, Mary Frances Dorn, Lucas Mentch

TL;DR
This paper introduces Locally Optimized Random Forests, a method that adapts traditional random forests to handle distributional shifts between training and test data, especially useful for extreme event prediction.
Contribution
It proposes a weighted random forest approach using importance sampling to account for distributional differences between training and test data.
Findings
Handles covariate shift effectively in predictive modeling.
Improves forecasting accuracy for extreme events like hurricanes.
Provides a data-driven adaptation for machine learning under distributional changes.
Abstract
Standard supervised learning procedures are validated against a test set that is assumed to have come from the same distribution as the training data. However, in many problems, the test data may have come from a different distribution. We consider the case of having many labeled observations from one distribution and making predictions at unlabeled points that come from another. We combine the high predictive accuracy of random forests (Breiman, 2001) with an importance sampling scheme, where the splits and predictions of the base-trees are done in a weighted manner, which we call Locally Optimized Random Forests. These weights correspond to a non-parametric estimate of the likelihood ratio between the training and test distributions. To estimate these ratios with an unlabeled test set, we make the covariate shift assumption, where the differences in distribution are only a…
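The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the likelihood ratio is approximated by the odds of a logistic-regression classifier trained to tell training points from test points (a common swapped-in estimator, not necessarily the non-parametric one used in the paper), and it relies on scikit-learn's `sample_weight` argument to weight the trees' splits and leaf predictions. The function names `estimate_importance_weights` and `fit_weighted_forest`, and the `clip` parameter, are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression


def estimate_importance_weights(X_train, X_test, clip=20.0):
    """Approximate w(x) = p_test(x) / p_train(x) at each training point.

    Assumption: a domain classifier's odds stand in for the density ratio.
    """
    X = np.vstack([X_train, X_test])
    # Domain label: 0 for training rows, 1 for (unlabeled) test rows.
    domain = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    clf = LogisticRegression(max_iter=1000).fit(X, domain)

    # Probability that each training point "looks like" a test point.
    p_test = np.clip(clf.predict_proba(X_train)[:, 1], 1e-6, 1.0 - 1e-6)

    # Classifier odds, corrected for the unequal sample sizes.
    ratio = (p_test / (1.0 - p_test)) * (len(X_train) / len(X_test))

    # Clip extreme weights (a variance-control heuristic, assumed here),
    # then normalise so the weights average to one.
    ratio = np.clip(ratio, 0.0, clip)
    return ratio / ratio.mean()


def fit_weighted_forest(X_train, y_train, X_test, seed=0):
    """Fit a random forest whose trees see importance-weighted samples."""
    weights = estimate_importance_weights(X_train, X_test)
    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    # sample_weight influences both the split criterion and leaf averages.
    forest.fit(X_train, y_train, sample_weight=weights)
    return forest, weights
```

Under the covariate shift assumption only the covariates are needed to estimate the weights, which is why the unlabeled test set suffices in this sketch. Training points that fall in regions where test points are dense receive larger weights, so the forest concentrates its fit there.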
Taxonomy
Topics: Hydrology and Drought Analysis · Anomaly Detection Techniques and Applications · Hydrological Forecasting Using AI
