Maximum Risk Minimization with Random Forests
Francesco Freni, Anya Fries, Linus Kühne, Markus Reichstein, Jonas Peters

TL;DR
This paper introduces variants of random forests designed for out-of-distribution generalization by minimizing maximum risk across different environments, with theoretical guarantees and empirical validation.
Contribution
It proposes computationally efficient MaxRM-based random forest algorithms with statistical consistency and out-of-sample guarantees for OOD settings.
Findings
Algorithms are computationally efficient.
Statistical consistency is proven for the primary method.
Validated on simulated and real-world data.
Abstract
We consider a regression setting where observations are collected in different environments modeled by different data distributions. The field of out-of-distribution (OOD) generalization aims to design methods that generalize better to test environments whose distributions differ from those observed during training. One line of such works has proposed to minimize the maximum risk across environments, a principle that we refer to as MaxRM (Maximum Risk Minimization). In this work, we introduce variants of random forests based on the principle of MaxRM. We provide computationally efficient algorithms and prove statistical consistency for our primary method. Our proposed method can be used with each of the following three risks: the mean squared error, the negative reward, and the regret (which quantifies the excess risk relative to the best predictor). For MaxRM with regret as the risk,…
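The MaxRM objective described in the abstract can be illustrated with a minimal sketch that evaluates a fixed set of predictions by its worst per-environment risk. This is not the authors' forest algorithm; it only computes the quantity that MaxRM seeks to minimize. The function names (`env_risks`, `max_risk`) and the `best_mse` argument are hypothetical. The negative reward is assumed here to be the MSE minus the second moment of the response, and the regret is assumed to be the MSE minus the best achievable MSE in that environment, which the caller has to supply; the abstract does not spell out either formula.

```python
import numpy as np

def env_risks(y, pred, env, risk="mse", best_mse=None):
    """Empirical risk of `pred` in each environment.

    Returns an array ordered like np.unique(env). `best_mse` is only
    used for risk="regret" and must follow the same ordering
    (assumption: the best achievable MSE per environment is known or
    has been estimated separately).
    """
    y, pred, env = np.asarray(y, float), np.asarray(pred, float), np.asarray(env)
    risks = []
    for i, e in enumerate(np.unique(env)):
        mask = env == e
        mse = np.mean((y[mask] - pred[mask]) ** 2)
        if risk == "mse":
            risks.append(mse)
        elif risk == "neg_reward":
            # assumed convention: negative reward = MSE - E[Y^2]
            risks.append(mse - np.mean(y[mask] ** 2))
        elif risk == "regret":
            # assumed convention: excess MSE over the best predictor
            risks.append(mse - best_mse[i])
        else:
            raise ValueError(f"unknown risk: {risk}")
    return np.array(risks)

def max_risk(y, pred, env, risk="mse", best_mse=None):
    """Worst-case (maximum) risk across environments."""
    return env_risks(y, pred, env, risk, best_mse).max()
```

A MaxRM-style method would pick the predictor for which `max_risk` is smallest, rather than the one with the smallest risk on the pooled data.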
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research · Statistical Methods and Inference
