Statistical Advantages of Oblique Randomized Decision Trees and Forests
Eliza O'Reilly

TL;DR
This paper introduces oblique Mondrian trees and forests that use linear combinations of features for data partitioning, providing theoretical analysis of their statistical properties, convergence rates, and robustness in dimension reduction tasks.
Contribution
The paper develops a theoretical framework for oblique Mondrian trees and forests, analyzing their generalization error, convergence rates, and robustness to error in the estimated features, and contrasts their optimality with that of axis-aligned methods.
Findings
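Oblique Mondrian estimators achieve minimax optimal convergence rates, with respect to the dimension of the relevant feature subspace, provided the split features are estimated at a fast enough consistency rate.
The risk bounds depend on how accurately the relevant features are estimated.
Axis-aligned Mondrian trees, whose splits are restricted to individual covariates, are suboptimal for general ridge functions.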
Abstract
This work studies the statistical implications of using features comprised of general linear combinations of covariates to partition the data in randomized decision tree and forest regression algorithms. Using random tessellation theory in stochastic geometry, we provide a theoretical analysis of a class of efficiently generated random tree and forest estimators that allow for oblique splits along such features. We call these estimators oblique Mondrian trees and forests, as the trees are generated by first selecting a set of features from linear combinations of the covariates and then running a Mondrian process that hierarchically partitions the data along these features. Generalization error bounds and convergence rates are obtained for the flexible function class of multi-index models for dimension reduction, where the output is assumed to depend on a low-dimensional relevant…
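The two-step construction described in the abstract (choose linear-combination features, then run a Mondrian process along them) can be sketched roughly as follows. This is an illustrative simplification, not the paper's implementation: the function names, the feature matrix `A`, and the `lifetime` parameter are hypothetical, and the process is run on the bounding box of the projected data rather than on the exact image of the input domain.

```python
import numpy as np

def mondrian_partition(lower, upper, lifetime, rng, birth=0.0):
    """Leaf cells (lower, upper) of a Mondrian process on an axis-aligned box."""
    sides = upper - lower
    rate = sides.sum()
    # The waiting time until a cell splits is exponential with rate equal to
    # the sum of its side lengths; a cell whose split time exceeds the
    # lifetime becomes a leaf.
    split_time = birth + rng.exponential(1.0 / rate) if rate > 0 else np.inf
    if split_time > lifetime:
        return [(lower, upper)]
    # Pick the split coordinate proportionally to side length, then a uniform cut.
    j = rng.choice(len(sides), p=sides / rate)
    cut = rng.uniform(lower[j], upper[j])
    left_upper, right_lower = upper.copy(), lower.copy()
    left_upper[j] = cut
    right_lower[j] = cut
    return (mondrian_partition(lower, left_upper, lifetime, rng, split_time)
            + mondrian_partition(right_lower, upper, lifetime, rng, split_time))

def fit_oblique_mondrian_tree(X, y, A, lifetime, rng):
    """Project X onto the columns of A (assumed given), partition, average y per cell."""
    Z = X @ A  # each column of Z is one linear-combination feature
    cells = mondrian_partition(Z.min(axis=0), Z.max(axis=0), lifetime, rng)
    fallback = y.mean()  # used for empty cells and out-of-box queries
    means = []
    for lo, hi in cells:
        mask = np.all((Z >= lo) & (Z <= hi), axis=1)
        means.append(y[mask].mean() if mask.any() else fallback)
    return {"A": A, "cells": cells, "means": np.array(means), "fallback": fallback}

def predict_tree(tree, X):
    Z = X @ tree["A"]
    out = np.full(len(Z), tree["fallback"])
    for (lo, hi), m in zip(tree["cells"], tree["means"]):
        out[np.all((Z >= lo) & (Z <= hi), axis=1)] = m
    return out

def predict_forest(trees, X):
    """A forest prediction is the average of its trees' predictions."""
    return np.mean([predict_tree(t, X) for t in trees], axis=0)
```

Setting `A` to the identity matrix recovers an axis-aligned Mondrian tree in this sketch, while a single column `a` gives splits along the direction `a`, which matches the structure of a ridge function of the form g(a·x).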
Taxonomy
Topics: Data Mining Algorithms and Applications · Neural Networks and Applications
Methods: Sparse Evolutionary Training
