Optimal Generalized Decision Trees via Integer Programming
Oktay Gunluk, Jayant Kalagnanam, Minhan Li, Matt Menickelly, Katya Scheinberg

TL;DR
This paper introduces a mixed integer programming approach to construct optimal decision trees that balance accuracy, interpretability, and robustness, effectively handling categorical and numerical features for small to medium-sized datasets.
Contribution
It presents a novel optimization-based method for building optimal decision trees of fixed size, incorporating feature subset decisions and handling both categorical and numerical data.
Findings
High accuracy with small trees on moderate datasets
Optimization problems are tractable with modern solvers
Method improves interpretability and robustness of decision trees
Abstract
Decision trees have been a very popular class of predictive models for decades due to their interpretability and good performance on categorical features. However, they are not always robust and tend to overfit the data. Additionally, if allowed to grow large, they lose interpretability. In this paper, we present a mixed integer programming formulation to construct optimal decision trees of a prespecified size. We take the special structure of categorical features into account and allow combinatorial decisions (based on subsets of values of features) at each node. Our approach can also handle numerical features via thresholding. We show that very good accuracy can be achieved with small trees using moderately-sized training sets. The optimization problems we solve are tractable with modern solvers.
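The two kinds of branching decisions mentioned in the abstract can be illustrated with a small sketch. It is not the paper's mixed integer programming formulation: it is a brute-force enumeration over one categorical feature for a single split, and the function names (`leaf_errors`, `best_subset_split`, `threshold_feature`) are hypothetical, chosen only for this example. It assumes each leaf predicts its majority label and that a numerical feature is turned into a binary indicator by a fixed cut point.

```python
from collections import Counter
from itertools import combinations


def leaf_errors(labels):
    # A leaf predicts its majority label; the errors are the remaining samples.
    return len(labels) - max(Counter(labels).values()) if labels else 0


def best_subset_split(values, labels):
    """Enumerate subsets S of a categorical feature's values.

    Samples whose value lies in S go left, all others go right.
    Returns the subset with the fewest misclassifications and that count.
    An empty subset means no split beats a single leaf.
    """
    categories = sorted(set(values))
    best_errors, best_subset = leaf_errors(labels), frozenset()
    for size in range(1, len(categories)):
        for subset in combinations(categories, size):
            chosen = frozenset(subset)
            left = [y for v, y in zip(values, labels) if v in chosen]
            right = [y for v, y in zip(values, labels) if v not in chosen]
            errors = leaf_errors(left) + leaf_errors(right)
            if errors < best_errors:
                best_errors, best_subset = errors, chosen
    return best_subset, best_errors


def threshold_feature(xs, cut):
    # Numerical feature handled via thresholding: 1 if x <= cut, else 0.
    return [1 if x <= cut else 0 for x in xs]


colors = ["red", "green", "blue", "red", "blue", "green"]
labels = [1, 0, 1, 1, 1, 0]
subset, errors = best_subset_split(colors, labels)
binary = threshold_feature([1.0, 2.5, 4.0], 2.5)
```

Enumeration of this kind grows exponentially with the number of category values and with tree size, which is one reason to pose the problem as an optimization model and hand it to a solver instead.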
Taxonomy
Topics: Machine Learning and Data Classification · Bayesian Modeling and Causal Inference · Imbalanced Data Classification Techniques
Methods: Interpretability
