Statistical-Computational Trade-offs for Recursive Adaptive Partitioning Estimators

Yan Shuo Tan; Jason M. Klusowski; Krishnakumar Balasubramanian

arXiv:2411.04394·stat.ML·September 11, 2025

TL;DR

This paper investigates the computational and statistical limits of recursive adaptive partitioning models like decision trees, revealing a sharp dichotomy based on the Merged Staircase Property and comparing their performance to neural networks.

Contribution

It establishes a new understanding of when greedy recursive partitioning algorithms succeed or fail, highlighting a statistical-computational trade-off and introducing novel proof techniques.

Findings

01

Greedy training requires $\exp(\Omega(d))$ samples to achieve low estimation error when the true regression function $f^{*}$ does not satisfy MSP.

02

When $f^{*}$ satisfies MSP, greedy training attains small estimation error with only $O(\log d)$ samples.

03

ERM-trained estimators need only $O(\log d)$ samples, regardless of whether $f^{*}$ satisfies MSP.

Abstract

Models based on recursive adaptive partitioning, such as decision trees and their ensembles, are popular for high-dimensional regression because they can potentially avoid the curse of dimensionality. Because empirical risk minimization (ERM) is computationally infeasible, these models are typically trained using greedy algorithms. Although effective in many cases, these algorithms have been empirically observed to get stuck at local optima. We explore this phenomenon in the context of learning sparse regression functions over $d$ binary features, showing that when the true regression function $f^{*}$ does not satisfy the Merged Staircase Property (MSP) of Abbe et al. (2022), greedy training requires $\exp(\Omega(d))$ samples to achieve low estimation error. Conversely, when $f^{*}$ does satisfy MSP, greedy training can attain small estimation error with only $O(\log d)$ samples. This dichotomy mirrors that of…
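The intuition behind the dichotomy can be illustrated with a small sketch. It is not the paper's construction or proof: it assumes features take values in $\{-1,+1\}$ under the uniform distribution, uses a CART-style variance-reduction criterion, and computes population (infinite-sample) split gains by enumerating all inputs. The names `split_gains`, `xor`, and `staircase` are hypothetical. The function `xor` ($x_1 x_2$) does not satisfy MSP, while `staircase` ($x_1 + x_1 x_2$) does.

```python
import itertools
import numpy as np

def split_gains(f, d, fixed=None):
    """Population variance reduction for splitting on each coordinate.

    f     : function mapping a length-d vector in {-1,+1}^d to a float
    d     : number of binary features
    fixed : optional dict {coordinate: value} describing the current cell
    """
    fixed = fixed or {}
    # Enumerate the whole hypercube (only feasible for small d).
    X = np.array(list(itertools.product([-1, 1], repeat=d)))
    for j, v in fixed.items():
        X = X[X[:, j] == v]  # restrict to the current cell
    y = np.array([f(x) for x in X], dtype=float)

    gains = np.zeros(d)
    for j in range(d):
        if j in fixed:
            continue  # coordinate already split on along this path
        left, right = y[X[:, j] == -1], y[X[:, j] == 1]
        # Both children have equal mass under the uniform distribution.
        gains[j] = y.var() - 0.5 * left.var() - 0.5 * right.var()
    return gains

xor = lambda x: x[0] * x[1]                # single monomial on two new coordinates
staircase = lambda x: x[0] + x[0] * x[1]   # {1}, then {1, 2}: one new coordinate per step

d = 4
gains_xor_root = split_gains(xor, d)
gains_stair_root = split_gains(staircase, d)
gains_stair_cell = split_gains(staircase, d, fixed={0: 1})

print("xor, root:           ", gains_xor_root)
print("staircase, root:     ", gains_stair_root)
print("staircase, x_1 = +1: ", gains_stair_cell)
```

For `xor`, every coordinate has zero gain at the root, so a greedy splitter has no population-level signal to prefer the relevant features and its first split is driven by sampling noise. For `staircase`, the first coordinate has positive gain at the root, and once the tree has split on it, the second coordinate has positive gain inside the resulting cell.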

Taxonomy

Topics: Algorithms and Data Compression · Bayesian Methods and Mixture Models