Statistical Advantages of Oblique Randomized Decision Trees and Forests

Eliza O'Reilly

arXiv:2407.02458 · math.ST · November 5, 2025

TL;DR

This paper introduces oblique Mondrian trees and forests, which partition the data along features formed from linear combinations of the covariates, and analyzes their statistical properties: generalization error, convergence rates, and robustness in dimension reduction tasks.

Contribution

It develops a theoretical framework for oblique Mondrian trees and forests, analyzing their generalization error, convergence, and robustness, and compares their optimality to axis-aligned methods.

Findings

01. Oblique Mondrian estimators achieve minimax optimal rates under certain conditions.

02. Risk bounds depend on feature estimation accuracy.

03. Axis-aligned Mondrian trees are suboptimal for ridge functions.
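
For readers unfamiliar with the terms used in these findings, the standard definitions of a ridge function and of the multi-index model named in the abstract are written out here. The symbols g, a, A, d, and s are notation assumed for this note and may differ from the paper's own.

```latex
% Ridge function (single-index model): the output depends on x
% only through its projection onto one direction a.
f(x) = g(\langle a, x \rangle), \qquad a \in \mathbb{R}^d,\; g : \mathbb{R} \to \mathbb{R}

% Multi-index model: the output depends on x only through its
% projection onto an s-dimensional relevant subspace spanned by the columns of A.
f(x) = g(A^\top x), \qquad A \in \mathbb{R}^{d \times s},\; s \le d,\; g : \mathbb{R}^s \to \mathbb{R}
```

An axis-aligned split thresholds a single coordinate of x, whereas an oblique split thresholds a linear combination of coordinates, so it can follow a direction a that is not a coordinate axis.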

Abstract

This work studies the statistical implications of using features comprised of general linear combinations of covariates to partition the data in randomized decision tree and forest regression algorithms. Using random tessellation theory in stochastic geometry, we provide a theoretical analysis of a class of efficiently generated random tree and forest estimators that allow for oblique splits along such features. We call these estimators oblique Mondrian trees and forests, as the trees are generated by first selecting a set of features from linear combinations of the covariates and then running a Mondrian process that hierarchically partitions the data along these features. Generalization error bounds and convergence rates are obtained for the flexible function class of multi-index models for dimension reduction, where the output is assumed to depend on a low-dimensional relevant…
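
The two-step generation procedure described in the abstract (select features as linear combinations of the covariates, then run a Mondrian process along them) can be illustrated with a minimal Python sketch. This is a simplified illustration, not the paper's implementation. It assumes the Mondrian process is run on the bounding box of the projected data and that each leaf predicts the mean response of its training points. The feature matrix `A`, the `lifetime` parameter name, and all function names are hypothetical.

```python
import numpy as np


def sample_mondrian(lower, upper, lifetime, rng, time=0.0):
    """Recursively sample a Mondrian partition of the box [lower, upper]."""
    lengths = upper - lower
    total = lengths.sum()
    if total <= 0:
        return {"leaf": True}
    # Waiting time to the next split is exponential with rate = sum of side lengths.
    split_time = time + rng.exponential(1.0 / total)
    if split_time > lifetime:
        return {"leaf": True}
    # Split dimension is chosen proportionally to side length; location is uniform.
    dim = int(rng.choice(len(lengths), p=lengths / total))
    loc = rng.uniform(lower[dim], upper[dim])
    left_upper = upper.copy()
    left_upper[dim] = loc
    right_lower = lower.copy()
    right_lower[dim] = loc
    return {
        "leaf": False,
        "dim": dim,
        "loc": loc,
        "left": sample_mondrian(lower, left_upper, lifetime, rng, split_time),
        "right": sample_mondrian(right_lower, upper, lifetime, rng, split_time),
    }


def leaf_id(node, z):
    """Return the left/right path (as a tuple of 0/1) of the leaf containing z."""
    path = []
    while not node["leaf"]:
        go_right = z[node["dim"]] > node["loc"]
        path.append(int(go_right))
        node = node["right"] if go_right else node["left"]
    return tuple(path)


def fit_oblique_mondrian_tree(X, y, A, lifetime, rng):
    """Project covariates onto the columns of A, then partition the projections."""
    Z = X @ A  # each column of A defines one oblique feature (assumed given)
    tree = sample_mondrian(Z.min(axis=0), Z.max(axis=0), lifetime, rng)
    stats = {}
    for z, target in zip(Z, y):
        key = leaf_id(tree, z)
        s, c = stats.get(key, (0.0, 0))
        stats[key] = (s + target, c + 1)
    means = {k: s / c for k, (s, c) in stats.items()}
    # Fall back to the global mean for leaves that contain no training point.
    return tree, means, float(np.mean(y))


def predict(model, A, x):
    """Predict with the mean response of the leaf that contains the projection of x."""
    tree, means, fallback = model
    return means.get(leaf_id(tree, x @ A), fallback)
```

A forest in this sketch would average `predict` over several trees fitted with independent random draws. Setting `A` to the identity matrix recovers an axis-aligned Mondrian tree on the raw covariates.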

Taxonomy

Topics: Data Mining Algorithms and Applications · Neural Networks and Applications

Methods: Sparse Evolutionary Training