Formal Hypothesis Tests for Additive Structure in Random Forests
Lucas Mentch, Giles Hooker

TL;DR
This paper introduces the first formal hypothesis tests for assessing variable importance and additive structure in random forests, leveraging ensemble properties and random projections for efficient, high-power inference.
Contribution
It develops novel statistical tests for additive structure in random forests, utilizing grid structures and random projections to enable efficient and powerful inference.
Findings
First tests for regression structure in random forests.
Variance estimation integrated into ensemble construction.
Random projections enable scalable testing with large grids.
Abstract
While statistical learning methods have proved powerful tools for predictive modeling, the black-box nature of the models they produce can severely limit their interpretability and the ability to conduct formal inference. However, the natural structure of ensemble learners like bagged trees and random forests has been shown to admit desirable asymptotic properties when base learners are built with proper subsamples. In this work, we demonstrate that by defining an appropriate grid structure on the covariate space, we may carry out formal hypothesis tests for both variable importance and underlying additive model structure. To our knowledge, these tests represent the first statistical tools for investigating the underlying regression structure in a context such as random forests. We develop notions of total and partial additivity and further demonstrate that testing can be carried out at…
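As a rough illustration of why a grid makes additivity checkable, consider predictions evaluated on a two-dimensional grid. A function of the form f(x, y) = g(x) + h(y) has double-centered grid values that are all zero, so any non-zero remainder signals interaction. The sketch below is not the authors' procedure: the function names `interaction_residuals` and `interaction_sum_of_squares` and the toy functions are assumptions made for illustration, and it uses exact function values rather than forest predictions. A formal test would also have to account for the sampling variability of the predictions, which is omitted here.

```python
def interaction_residuals(grid):
    """Double-center a rectangular grid of values (hypothetical helper).

    grid[i][j] holds the value at the i-th level of the first covariate
    and the j-th level of the second. The result is zero everywhere
    exactly when the grid values are additive in the two covariates.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    row_means = [sum(row) / n_cols for row in grid]
    col_means = [sum(grid[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return [[grid[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n_cols)]
            for i in range(n_rows)]


def interaction_sum_of_squares(grid):
    """Sum of squared interaction residuals (unscaled, not a test statistic)."""
    return sum(r * r for row in interaction_residuals(grid) for r in row)


# Toy grids: one additive surface and one with a pure interaction term.
xs = [0.0, 0.5, 1.0]
ys = [0.0, 1.0, 2.0]
additive = [[x ** 2 + 3 * y for y in ys] for x in xs]
interacting = [[x * y for y in ys] for x in xs]

print(interaction_sum_of_squares(additive))
print(interaction_sum_of_squares(interacting))
```

The additive grid leaves essentially nothing after double-centering, while the interacting grid leaves a clearly non-zero remainder. Turning that remainder into a hypothesis test requires scaling it by an estimate of its variance, which is the role the variance estimates from the ensemble play in the paper.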
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Neural Networks and Applications · Bayesian Modeling and Causal Inference
Methods: Interpretability
