RS-ORT: A Reduced-Space Branch-and-Bound Algorithm for Optimal Regression Trees
Cristobal Heredia, Pedro Chumpitaz-Flores, Kaixun Hua

TL;DR
RS-ORT is a branch-and-bound algorithm for optimal regression trees that handles large-scale continuous data while guaranteeing global optimality; the authors report that it outperforms existing methods.
Contribution
The paper proposes RS-ORT, a reduced-space branch-and-bound algorithm that branches exclusively on tree-structural variables, a design that guarantees convergence and independence from the number of training samples.
Findings
RS-ORT outperforms state-of-the-art methods on regression benchmarks.
On large datasets, it attains training performance backed by an optimality guarantee while using simpler trees.
The algorithm scales to datasets with up to 2 million samples within four hours.
Abstract
Mixed-integer programming (MIP) has emerged as a powerful framework for learning optimal decision trees. Yet, existing MIP approaches for regression tasks are either limited to purely binary features or become computationally intractable when continuous, large-scale data are involved. Naively binarizing continuous features sacrifices global optimality and often yields needlessly deep trees. We recast the optimal regression-tree training as a two-stage optimization problem and propose Reduced-Space Optimal Regression Trees (RS-ORT) - a specialized branch-and-bound (BB) algorithm that branches exclusively on tree-structural variables. This design guarantees the algorithm's convergence and its independence from the number of training samples. Leveraging the model's structure, we introduce several bound tightening techniques - closed-form leaf prediction, empirical threshold discretization,…
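Two of the bound-tightening ideas named in the abstract, closed-form leaf prediction and empirical threshold discretization, can be illustrated on a depth-1 tree. The sketch below is not the RS-ORT algorithm and not the authors' code. It assumes a squared-error loss (the abstract does not state the loss), under which the best constant prediction for a leaf is the mean of its targets, and it assumes that candidate split thresholds are taken as midpoints between consecutive distinct observed feature values. The function names are hypothetical.

```python
from statistics import fmean


def candidate_thresholds(values):
    """Midpoints between consecutive distinct observed values of one feature.

    Assumption: splitting anywhere between two adjacent observed values
    partitions the training samples identically, so one representative
    threshold per gap is enough.
    """
    distinct = sorted(set(values))
    return [(a + b) / 2.0 for a, b in zip(distinct, distinct[1:])]


def leaf_prediction(targets):
    """Closed-form leaf value under squared-error loss (assumed): the mean."""
    return fmean(targets)


def sum_squared_error(targets):
    """Squared error of a leaf that predicts its closed-form optimum."""
    if not targets:
        return 0.0
    c = leaf_prediction(targets)
    return sum((t - c) ** 2 for t in targets)


def best_stump(xs, ys):
    """Enumerate candidate thresholds for a single split on one feature.

    Plain exhaustive search for illustration only; it does no bounding
    or pruning.
    """
    best_t, best_loss = None, float("inf")
    for t in candidate_thresholds(xs):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        loss = sum_squared_error(left) + sum_squared_error(right)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t, best_loss


# Toy data: two well-separated clusters with constant targets.
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
threshold, loss = best_stump(xs, ys)
print(threshold, loss)
```

Because the leaf values follow in closed form once the split is fixed, only the split feature and threshold remain as decisions in this toy setting, which is the intuition behind searching over tree-structural variables alone.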
