The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees
Jonathan Bodine, Dorit S. Hochbaum

TL;DR
This paper introduces the Max-Cut decision tree, which improves classification accuracy and reduces computation time by combining a novel splitting metric with PCA-based feature selection. It is especially effective on high-dimensional data.
Contribution
The paper proposes a new decision tree method combining Max-Cut splitting and PCA-based feature selection, significantly improving performance and efficiency over traditional CART trees.
Findings
49% improvement in accuracy on CIFAR-100, relative to CART Gini
94% reduction in CPU time on CIFAR-100, relative to CART Gini
Significant gains on high-dimensional, multi-class datasets
Abstract
Decision trees are a widely used method for classification, both by themselves and as the building blocks of several ensemble learning methods. The Max-Cut decision tree involves novel modifications to a standard, baseline model of classification decision tree construction, specifically CART Gini. One modification is an alternative splitting metric, maximum cut, which maximizes the total distance between all pairs of observations that belong to separate classes and fall on separate sides of the threshold value. The other modification is to select the decision feature from a linear combination of the input features, constructed using Principal Component Analysis (PCA) locally at each node. Our experiments show that this node-based localized PCA with the novel splitting modification can dramatically improve classification, while also significantly decreasing computational time compared…
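The two modifications described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "distance" means the absolute difference along the candidate feature, evaluates the metric by brute force over all pairs (an efficient implementation would avoid this), and takes only the first principal component of the observations at a node as the candidate feature. The function names `max_cut_score`, `best_max_cut_threshold`, and `first_principal_component` are hypothetical.

```python
import numpy as np


def max_cut_score(values, labels, threshold):
    """Sum of |v_i - v_j| over pairs that are in different classes
    AND on opposite sides of the threshold (naive O(n^2) evaluation)."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    left = values <= threshold
    left_vals, left_labs = values[left], labels[left]
    right_vals, right_labs = values[~left], labels[~left]
    if left_vals.size == 0 or right_vals.size == 0:
        return 0.0
    # Pairwise distances between every left and every right observation.
    dist = np.abs(left_vals[:, None] - right_vals[None, :])
    # Only pairs whose class labels differ contribute to the cut.
    differs = left_labs[:, None] != right_labs[None, :]
    return float(np.sum(dist * differs))


def best_max_cut_threshold(values, labels):
    """Try midpoints between consecutive distinct values and return
    (threshold, score) for the one with the largest max-cut score."""
    values = np.asarray(values, dtype=float)
    uniq = np.unique(values)
    if uniq.size < 2:
        return None, 0.0
    candidates = (uniq[:-1] + uniq[1:]) / 2.0
    scores = [max_cut_score(values, labels, t) for t in candidates]
    best = int(np.argmax(scores))
    return float(candidates[best]), float(scores[best])


def first_principal_component(X):
    """Project the node's observations onto their first principal
    component (assumption: one component, computed via SVD)."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


# Usage: build a local PCA feature at a node, then pick its threshold.
X_node = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y_node = np.array([0, 0, 1, 1])
feature = first_principal_component(X_node)
threshold, score = best_max_cut_threshold(feature, y_node)
```

Unlike Gini impurity, which depends only on the class counts on each side of the split, this score also rewards thresholds that place differently labeled observations far apart along the feature.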
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification · Imbalanced Data Classification Techniques
Methods: Principal Components Analysis
