Learning Optimal Classification Trees Robust to Distribution Shifts
Nathan Justin, Sina Aghaei, Andrés Gómez, Phebe Vayanos

TL;DR
This paper introduces a method, based on mixed-integer robust optimization, for learning classification trees that are robust to distribution shifts between training and deployment data. Motivated by high-stakes applications, it reports gains in both worst-case and average-case accuracy over non-robust trees.
Contribution
The paper formulates robust tree learning as a mixed-integer robust optimization problem, provides a tailored solution approach, and demonstrates improved accuracy over non-robust trees.
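As a rough illustration of what such a formulation can look like, the display below gives a generic max-min statement of robust tree learning. It is a sketch under stated assumptions, not necessarily the paper's exact model: the symbols \Pi_d (trees of depth at most d), \Xi (the set of admissible data perturbations), \xi^i (the perturbation applied to sample i), and (x^i, y^i) (the training samples indexed by I) are all notation introduced here for illustration.

```latex
% Generic robust tree learning problem (illustrative notation, assumed here):
% choose the tree that maximizes the number of correct predictions
% under the worst admissible perturbation of the training features.
\max_{\pi \in \Pi_d} \; \min_{\xi \in \Xi} \; \sum_{i \in I}
  \mathbb{1}\!\left[ \pi\!\left(x^i + \xi^i\right) = y^i \right]
```

The indicator inside the sum is what makes an objective of this kind nonlinear and discontinuous: a small change in \xi^i can move a sample across a branching threshold and flip its prediction.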
Findings
Up to 12.48% increase in worst-case accuracy.
Up to 4.85% increase in average-case accuracy.
Effective on multiple publicly available datasets.
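To make the difference between nominal and worst-case accuracy concrete, here is a minimal Python sketch. It is not the paper's evaluation procedure. It assumes a single integer feature, a hypothetical depth-1 tree (a threshold stump), and an adversary who may shift feature values by whole units subject to a total budget; the function names and the toy data are illustrative only.

```python
from typing import List, Tuple


def stump_predict(x: int, threshold: int) -> int:
    """Depth-1 tree: predict 1 if the feature exceeds the threshold, else 0."""
    return 1 if x > threshold else 0


def flip_cost(x: int, threshold: int) -> int:
    """Smallest integer shift that moves x to the other side of the threshold."""
    if x > threshold:
        return x - threshold      # shift down to exactly the threshold
    return threshold + 1 - x      # shift up to just above the threshold


def nominal_and_worst_case_accuracy(
    data: List[Tuple[int, int]], threshold: int, budget: int
) -> Tuple[float, float]:
    """Return (nominal accuracy, worst-case accuracy under a total shift budget).

    Assumption: the adversary pays one unit of budget per unit of shift and
    wants to misclassify as many samples as possible. Every flipped sample
    counts equally, so flipping the cheapest correct samples first is optimal.
    """
    # Costs to flip each sample that the stump currently classifies correctly.
    costs = sorted(
        flip_cost(x, threshold)
        for x, y in data
        if stump_predict(x, threshold) == y
    )
    nominal_correct = len(costs)

    flipped, remaining = 0, budget
    for cost in costs:
        if cost > remaining:
            break
        remaining -= cost
        flipped += 1

    n = len(data)
    return nominal_correct / n, (nominal_correct - flipped) / n


# Toy data: (feature value, label).
toy_data = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 1), (8, 1)]
nominal, worst = nominal_and_worst_case_accuracy(toy_data, threshold=3, budget=2)
print(f"nominal={nominal:.3f}, worst-case={worst:.3f}")
```

In this toy setting the stump is perfect on the unperturbed data, yet a budget of two unit shifts is enough to misclassify the two samples nearest the threshold. A robust learner would account for this sensitivity when choosing the tree.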
Abstract
We consider the problem of learning classification trees that are robust to distribution shifts between training and testing/deployment data. This problem arises frequently in high-stakes settings such as public health and social work, where data is often collected using self-reported surveys that are highly sensitive to, e.g., the framing of the questions, the time and place at which the survey is conducted, and the level of comfort the interviewee has in sharing information with the interviewer. We propose a method for learning optimal robust classification trees based on mixed-integer robust optimization technology. In particular, we demonstrate that the problem of learning an optimal robust tree can be cast as a single-stage mixed-integer robust optimization problem with a highly nonlinear and discontinuous objective. We reformulate this problem equivalently as a two-stage linear…
Taxonomy
Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Anomaly Detection Techniques and Applications
