Superfast Selection for Decision Tree Algorithms
Huaduo Wang, Gopal Gupta

TL;DR
This paper introduces Superfast Selection, a method that significantly accelerates split selection for decision tree and feature selection algorithms, removes the need for pre-encoding, and enables ultrafast training and tuning on large datasets.
Contribution
The paper presents Superfast Selection, a novel method that reduces split selection complexity and integrates into CART to create Ultrafast Decision Trees with rapid training and tuning capabilities.
Findings
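Superfast Selection reduces the time complexity of split selection on a single feature from O(MN) to O(M), where M is the number of input examples and N the number of unique feature values.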
Ultrafast Decision Tree trains on large datasets within seconds.
Hyper-parameter tuning is significantly faster with the proposed method, through what the abstract calls Training Only Once Tuning.
Abstract
We present a novel and systematic method, called Superfast Selection, for selecting the "optimal split" for decision tree and feature selection algorithms over tabular data. The method speeds up split selection on a single feature by lowering the time complexity, from O(MN) (using the standard selection methods) to O(M), where M represents the number of input examples and N the number of unique values. Additionally, the need for pre-encoding, such as one-hot or integer encoding, for feature value heterogeneity is eliminated. To demonstrate the efficiency of Superfast Selection, we empower the CART algorithm by integrating Superfast Selection into it, creating what we call Ultrafast Decision Tree (UDT). This enhancement enables UDT to complete the training process with a time complexity O(KM) (K is the number of features). Additionally, the Training Only Once Tuning enables UDT to…
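To make the O(MN) baseline in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a single numeric feature, binary 0/1 labels, and Gini impurity as the split criterion; the function names `best_split_naive` and `best_split_prefix` are hypothetical. The first function rescans all M examples for each of the N candidate thresholds. The second collects per-value counts in one O(M) pass and then sweeps the unique values with running totals. Because it sorts the unique values, the sketch costs O(M + N log N) rather than the O(M) the paper reports, so it only illustrates why rescanning is avoidable.

```python
from collections import defaultdict


def gini(pos, total):
    """Gini impurity of a group with `pos` positive labels out of `total`."""
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def weighted_gini(left_pos, left_n, right_pos, right_n):
    """Size-weighted impurity of a binary split."""
    m = left_n + right_n
    return (left_n * gini(left_pos, left_n) + right_n * gini(right_pos, right_n)) / m


def best_split_naive(values, labels):
    """Baseline: rescan all M examples for each of N thresholds, O(M*N)."""
    best = (float("inf"), None)
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        if not right:  # splitting at the largest value separates nothing
            continue
        score = weighted_gini(sum(left), len(left), sum(right), len(right))
        if score < best[0]:
            best = (score, t)
    return best


def best_split_prefix(values, labels):
    """One O(M) counting pass, then one sweep over the N unique values."""
    count = defaultdict(int)  # examples per unique value
    pos = defaultdict(int)    # positive labels per unique value
    for x, y in zip(values, labels):
        count[x] += 1
        pos[x] += y
    m = len(values)
    total_pos = sum(pos.values())

    best = (float("inf"), None)
    left_n = left_pos = 0
    for t in sorted(count):  # assumption: sorting adds O(N log N)
        left_n += count[t]
        left_pos += pos[t]
        right_n = m - left_n
        if right_n == 0:
            continue
        score = weighted_gini(left_pos, left_n, total_pos - left_pos, right_n)
        if score < best[0]:
            best = (score, t)
    return best


values = [1, 1, 2, 3, 3, 4]
labels = [0, 0, 0, 1, 1, 1]
print(best_split_naive(values, labels))
print(best_split_prefix(values, labels))
```

Both functions return a `(score, threshold)` pair for the split `x <= threshold`, and they agree on every input because they evaluate the same expression on the same integer counts. Only the way those counts are obtained differs.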
Taxonomy
Topics: Data Mining Algorithms and Applications · Rough Sets and Fuzzy Logic · Fuzzy Logic and Control Systems
Methods: Feature Selection
