Near Optimal Decision Trees in a SPLIT Second
Varun Babbar, Hayden McTavish, Cynthia Rudin, Margo Seltzer

TL;DR
This paper introduces SPLIT algorithms that efficiently find near-optimal decision trees with high accuracy and scalability, balancing the benefits of greedy and optimal methods for interpretable machine learning.
Contribution
The paper presents SPLIT, a family of algorithms that runs far faster than optimal decision tree search while retaining near-optimal accuracy, and that extends to computing sets of near-optimal trees.
Findings
SPLIT algorithms are orders of magnitude faster than existing optimal methods.
Near-optimal trees achieve accuracy close to fully optimal solutions.
Scalable computation of the Rashomon set of trees is demonstrated.
Abstract
Decision tree optimization is fundamental to interpretable machine learning. The most popular approach is to greedily search for the best feature at every decision point, which is fast but provably suboptimal. Recent approaches find the global optimum using branch and bound with dynamic programming, showing substantial improvements in accuracy and sparsity at great cost to scalability. An ideal solution would have the accuracy of an optimal method and the scalability of a greedy method. We introduce a family of algorithms called SPLIT (SParse Lookahead for Interpretable Trees) that moves us significantly forward in achieving this ideal balance. We demonstrate that not all sub-problems need to be solved to optimality to find high quality trees; greediness suffices near the leaves. Since each depth adds an exponential number of possible trees, this change makes our algorithms orders of…
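The idea in the abstract, searching exhaustively near the root and letting greediness take over near the leaves, can be illustrated with a small sketch. This is not the authors' implementation: the paper describes branch and bound with dynamic programming, whereas the code below uses plain recursion with no bounds or caching. The function names, the `lookahead` parameter, the per-leaf penalty `lam`, and the toy dataset are all assumptions made for illustration. Features and labels are binary, and the objective is the number of misclassified samples plus `lam` times the number of leaves.

```python
from typing import List, Tuple

Rows = List[List[int]]
Labels = List[int]


def leaf_errors(y: Labels) -> int:
    """Misclassifications when a leaf predicts its majority label."""
    ones = sum(y)
    return min(ones, len(y) - ones)


def partition(X: Rows, y: Labels, j: int) -> Tuple[Tuple[Rows, Labels], Tuple[Rows, Labels]]:
    """Split the samples on binary feature j into (feature == 0, feature == 1)."""
    left = [i for i, row in enumerate(X) if row[j] == 0]
    right = [i for i, row in enumerate(X) if row[j] == 1]
    return (([X[i] for i in left], [y[i] for i in left]),
            ([X[i] for i in right], [y[i] for i in right]))


def greedy_objective(X: Rows, y: Labels, depth: int, lam: float) -> float:
    """Objective of a greedy tree: pick the split with the fewest immediate errors."""
    leaf_obj = leaf_errors(y) + lam
    if depth == 0 or leaf_errors(y) == 0:
        return leaf_obj
    best = None
    for j in range(len(X[0])):
        (Xl, yl), (Xr, yr) = partition(X, y, j)
        if not yl or not yr:
            continue  # feature is constant on this subset
        err = leaf_errors(yl) + leaf_errors(yr)
        if best is None or err < best[0]:
            best = (err, Xl, yl, Xr, yr)
    if best is None:
        return leaf_obj
    _, Xl, yl, Xr, yr = best
    split_obj = (greedy_objective(Xl, yl, depth - 1, lam)
                 + greedy_objective(Xr, yr, depth - 1, lam))
    return min(leaf_obj, split_obj)


def lookahead_objective(X: Rows, y: Labels, depth: int, lookahead: int, lam: float) -> float:
    """Try every split for the first `lookahead` levels, then finish greedily."""
    if lookahead == 0 or depth == 0:
        return greedy_objective(X, y, depth, lam)
    best = leaf_errors(y) + lam
    if leaf_errors(y) == 0:
        return best
    for j in range(len(X[0])):
        (Xl, yl), (Xr, yr) = partition(X, y, j)
        if not yl or not yr:
            continue
        obj = (lookahead_objective(Xl, yl, depth - 1, lookahead - 1, lam)
               + lookahead_objective(Xr, yr, depth - 1, lookahead - 1, lam))
        best = min(best, obj)
    return best


# Made-up toy data: y = x0 XOR x1, plus a feature x2 that is weakly
# correlated with y and tempts the greedy method at the root.
X = [[0, 0, 0], [0, 0, 1], [0, 1, 1], [0, 1, 1],
     [1, 0, 1], [1, 0, 0], [1, 1, 0], [1, 1, 0]]
y = [0, 0, 1, 1, 1, 1, 0, 0]

print(f"greedy:      {greedy_objective(X, y, depth=2, lam=0.01):.2f}")
print(f"lookahead=1: {lookahead_objective(X, y, depth=2, lookahead=1, lam=0.01):.2f}")
```

On this toy data the purely greedy tree splits on `x2` at the root and ends with 2 errors, while a single level of exhaustive lookahead followed by greedy splits recovers the error-free XOR tree. Each extra level of exhaustive search multiplies the number of candidate trees, which is why restricting the exhaustive part to the levels nearest the root reduces the work.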
Taxonomy
Topics: Neural Networks and Applications · Fault Detection and Control Systems · Machine Learning and Data Classification
Methods: Lookahead
