Robust Optimal Classification Trees under Noisy Labels
Víctor Blanco, Alberto Japón, Justo Puerto

TL;DR
This paper introduces a new method for building optimal classification trees that are robust to noisy labels by integrating SVM-inspired splitting and label noise detection within a mixed integer nonlinear programming framework.
Contribution
The paper presents a novel approach combining margin-based splitting and label noise correction in optimal classification trees, formulated as a mixed integer nonlinear program.
Findings
Effective in handling noisy labels on UCI datasets
Outperforms traditional classification trees in noisy environments
Demonstrates robustness and improved accuracy
Abstract
In this paper we propose a novel methodology for constructing Optimal Classification Trees that accounts for noisy labels in the training sample. Our approach rests on two main elements: (1) the splitting rules of the classification trees are designed to maximize the separation margin between classes, following the SVM paradigm; and (2) some of the labels in the training sample may be changed during the construction of the tree in an attempt to detect label noise. Both features are integrated to design the resulting Optimal Classification Tree. We present a Mixed Integer Nonlinear Programming formulation of the problem that can be solved with any of the available off-the-shelf solvers. The model is analyzed and tested on a battery of standard datasets taken from the UCI Machine Learning Repository, showing the effectiveness of our…
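The interplay between the two elements in the abstract can be illustrated on a single split. The sketch below is not the authors' formulation: the function name `relabeled_split_objective`, the weights `c_hinge` and `c_flip`, and the toy data are all assumptions made for illustration. It scores a fixed hyperplane by an SVM-style term (squared norm plus hinge loss), computed on labels that may be flipped, and charges a penalty for each flip.

```python
def relabeled_split_objective(w, b, X, y, flip, c_hinge=1.0, c_flip=0.5):
    """Toy objective for ONE split (illustrative, not the paper's model).

    w, b    : hyperplane coefficients and intercept
    X, y    : feature vectors and labels in {-1, +1}
    flip    : booleans, True where a label is treated as noisy and flipped
    c_hinge : assumed weight on the hinge loss
    c_flip  : assumed cost paid for each relabeled observation
    """
    norm_term = 0.5 * sum(wj * wj for wj in w)  # small norm = wide margin
    hinge_total = 0.0
    for xi, yi, fi in zip(X, y, flip):
        y_eff = -yi if fi else yi  # label after the optional flip
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        hinge_total += max(0.0, 1.0 - y_eff * score)
    return norm_term + c_hinge * hinge_total + c_flip * sum(flip)


# 1-D toy data: the last point is labeled -1 but lies deep in the +1 region.
X = [[-2.0], [-1.5], [1.5], [2.0], [2.5]]
y = [-1, -1, 1, 1, -1]
w, b = [1.0], 0.0

keep_all = [False] * len(y)
flip_last = [False, False, False, False, True]

obj_keep = relabeled_split_objective(w, b, X, y, keep_all)
obj_flip = relabeled_split_objective(w, b, X, y, flip_last)
print(obj_keep, obj_flip)
```

With these assumed weights, flipping the suspicious label is cheaper than paying its hinge loss, so a model that minimizes this objective would relabel that point. In the paper the hyperplanes, the tree structure, and the flips are decided jointly by the mixed integer program rather than evaluated for a fixed split as here.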
