An improved column-generation-based matheuristic for learning classification trees
Krunal Kishor Patel, Guy Desaulniers, Andrea Lodi

TL;DR
This paper enhances a column-generation-based heuristic for learning decision trees, improving scalability and efficiency for large multiclass datasets by modifying subproblems, using implied constraints as cuts, and generating violated constraints.
Contribution
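The paper modifies the existing column-generation approach in three ways to improve scalability in multiclass classification: it changes the subproblem model so that fewer subproblems are needed, it uses the implied data-dependent constraints of the master problem as cutting planes, and it describes a separation model for generating violated constraints.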
Findings
Enhanced scalability demonstrated on large datasets
Reduced number of subproblems in multiclass cases
Improved computational performance over previous methods
Abstract
Decision trees are highly interpretable models for solving classification problems in machine learning (ML). The standard ML algorithms for training decision trees are fast but generate suboptimal trees in terms of accuracy. Other discrete optimization models in the literature address the optimality problem but only work well on relatively small datasets. \cite{firat2020column} proposed a column-generation-based heuristic approach for learning decision trees. This approach improves scalability and can work with large datasets. In this paper, we describe improvements to this column generation approach. First, we modify the subproblem model to significantly reduce the number of subproblems in multiclass classification instances. Next, we show that the data-dependent constraints in the master problem are implied, and use them as cutting planes. Furthermore, we describe a separation model…
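The idea of adding data-dependent constraints only when they are violated can be illustrated with a small sketch. This is not the paper's separation model, which the truncated abstract does not spell out. It assumes, purely for illustration, that each constraint says "the selected paths covering a data row sum to at most 1", and it finds violated rows by simple enumeration. The names `violated_rows`, `x`, and `covers` are hypothetical.

```python
from typing import Dict, List, Set


def violated_rows(
    x: Dict[str, float],
    covers: Dict[str, Set[int]],
    n_rows: int,
    tol: float = 1e-6,
) -> List[int]:
    """Return the rows whose (assumed) covering constraint is violated.

    x      -- current master-problem value of each path variable (hypothetical)
    covers -- for each path, the set of data rows it covers (hypothetical)
    A row r is violated when the sum of x[p] over paths p covering r
    exceeds 1 + tol. This constraint form is an assumption, not the
    paper's exact model.
    """
    load = [0.0] * n_rows
    for path, value in x.items():
        for r in covers[path]:
            load[r] += value
    return [r for r in range(n_rows) if load[r] > 1.0 + tol]


# Toy fractional solution over three paths and four data rows.
x = {"p1": 0.75, "p2": 0.5, "p3": 0.25}
covers = {"p1": {0, 1}, "p2": {1, 2}, "p3": {2, 3}}

# Row 1 is covered by p1 and p2: 0.75 + 0.5 = 1.25 > 1, so it is violated.
print(violated_rows(x, covers, n_rows=4))  # [1]
```

In a cutting-plane loop, only the constraints for the returned rows would be added to the master problem before re-solving, so the master starts small and grows only where the current solution requires it.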
Taxonomy
Topics: Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI) · Metaheuristic Optimization Algorithms Research
