TL;DR
This paper introduces a novel bi-objective optimization algorithm to compute provably optimal decision trees tailored for nonlinear metrics like F1-score, addressing a significant gap in current decision tree methods.
Contribution
It presents the first method to generate optimal decision trees for nonlinear metrics by treating misclassifications as separate objectives and exploring the Pareto frontier.
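The idea of treating the two kinds of misclassification as separate objectives can be illustrated with a small sketch. It is not the paper's algorithm, which searches over trees. It only assumes that each candidate tree has already been summarised as a hypothetical `(false_positives, false_negatives)` pair, filters those pairs down to the nondominated ones, and then picks the pair with the best F1-score. The names `pareto_front`, `f1_score`, `best_by_f1`, and the sample numbers are made up for illustration.

```python
from typing import List, Tuple

# Each candidate tree is summarised as (false_positives, false_negatives).
Point = Tuple[int, int]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the nondominated (fp, fn) pairs, sorted by increasing fp."""
    front: List[Point] = []
    best_fn = None
    # After sorting by (fp, fn), a point is nondominated only if it has
    # strictly fewer false negatives than every point kept so far.
    for fp, fn in sorted(set(points)):
        if best_fn is None or fn < best_fn:
            front.append((fp, fn))
            best_fn = fn
    return front


def f1_score(fp: int, fn: int, positives: int) -> float:
    """F1 computed from error counts and the number of positive samples."""
    tp = positives - fn
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_by_f1(points: List[Point], positives: int) -> Point:
    """Pick the F1-maximising point, looking only at the Pareto front."""
    return max(pareto_front(points),
               key=lambda p: f1_score(p[0], p[1], positives))


# Hypothetical candidates on a dataset with 10 positive samples.
candidates = [(0, 8), (2, 5), (3, 5), (4, 2), (10, 1), (6, 4)]
front = pareto_front(candidates)
best = best_by_f1(candidates, positives=10)
```

In this toy example, `(3, 5)` and `(6, 4)` are dominated and dropped, and the best F1 among the remaining four points equals the best F1 over all six, so nothing is lost by restricting the search to the front.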
Findings
The method computes provably optimal trees for nonlinear metrics such as the F1-score.
Runtimes are reasonable for most datasets.
The resulting trees score better on the nonlinear metric they are optimised for.
Abstract
Nonlinear metrics, such as the F1-score, Matthews correlation coefficient, and Fowlkes-Mallows index, are often used to evaluate the performance of machine learning models, in particular, when facing imbalanced datasets that contain more samples of one class than the other. Recent optimal decision tree algorithms have shown remarkable progress in producing trees that are optimal with respect to linear criteria, such as accuracy, but unfortunately nonlinear metrics remain a challenge. To address this gap, we propose a novel algorithm based on bi-objective optimisation, which treats misclassifications of each binary class as a separate objective. We show that, for a large class of metrics, the optimal tree lies on the Pareto frontier. Consequently, we obtain the optimal tree by using our method to generate the set of all nondominated trees. To the best of our knowledge, this is the first…
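As a worked instance of why the optimal tree lies on the Pareto frontier, consider the F1-score alone. This is a standard calculation, not the paper's general argument for its larger class of metrics. The symbols $P$ (number of positive samples), $\mathrm{FP}$, and $\mathrm{FN}$ are notation introduced here for illustration.

```latex
% F1 in terms of the two misclassification counts, using TP = P - FN:
\[
  F_1(\mathrm{FP}, \mathrm{FN})
    = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}
    = \frac{2\,(P - \mathrm{FN})}{2P - \mathrm{FN} + \mathrm{FP}} .
\]
% Both partial derivatives are non-positive for 0 <= FN <= P and FP >= 0:
\[
  \frac{\partial F_1}{\partial \mathrm{FP}}
    = -\frac{2\,(P - \mathrm{FN})}{(2P - \mathrm{FN} + \mathrm{FP})^{2}} \le 0,
  \qquad
  \frac{\partial F_1}{\partial \mathrm{FN}}
    = -\frac{2\,(P + \mathrm{FP})}{(2P - \mathrm{FN} + \mathrm{FP})^{2}} \le 0 .
\]
```

Because $F_1$ never increases when either error count grows, a tree that is dominated in both $\mathrm{FP}$ and $\mathrm{FN}$ cannot have a strictly higher F1-score than the tree that dominates it, so some F1-optimal tree is always nondominated.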