Simple is better: Making Decision Trees faster using random sampling
Vignesh Nanda Kumar, Narayanan U Edakunni

TL;DR
This paper demonstrates that simple random sampling of candidate split points in decision trees can match or outperform complex quantile-based methods in both accuracy and efficiency, which simplifies distributed tree construction.
Contribution
The paper shows, both theoretically and empirically, that uniform random sampling of split points is as effective as, or better than, sophisticated quantile methods for distributed decision tree construction.
Findings
Random sampling achieves comparable accuracy to quantile methods.
Random sampling reduces computational complexity.
Simpler methods can replace complex quantile algorithms.
Abstract
In recent years, gradient boosted decision trees have become popular in building robust machine learning models on big data. The primary technique that has enabled these algorithms' success has been distributing the computation while building the decision trees. Distributed decision tree building, in turn, has been enabled by building quantiles of the big datasets and choosing the candidate split points from these quantile sets. In XGBoost, for instance, a sophisticated quantile building algorithm is employed to identify the candidate split points for the decision trees. This method is often projected to yield better results when the computation is distributed. In this paper, we dispel the notion that these methods provide more accurate and scalable methods for building decision trees in a distributed manner. In a significant contribution, we show theoretically and empirically that…
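The contrast the abstract draws between quantile-based and randomly sampled candidate split points can be illustrated with a minimal sketch on a single feature. This is not the paper's implementation: it assumes candidates are drawn uniformly at random from the observed feature values, it uses plain exact quantiles rather than XGBoost's quantile sketch, and it scores splits by variance reduction. The helper names (`quantile_candidates`, `random_candidates`, `sse`, `best_split`) and the synthetic data are hypothetical.

```python
import random


def quantile_candidates(values, k):
    """k thresholds at evenly spaced ranks of the sorted values (exact quantiles)."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // (k + 1)] for i in range(1, k + 1)]


def random_candidates(values, k, rng):
    """k thresholds drawn uniformly at random (without replacement) from the values."""
    return rng.sample(values, k)


def sse(ys):
    """Sum of squared deviations from the mean; 0.0 for an empty list."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(xs, ys, candidates):
    """Return (threshold, gain) with the largest variance reduction among candidates."""
    parent = sse(ys)
    best_threshold, best_gain = None, 0.0
    for t in candidates:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # skip degenerate splits that leave one side empty
        gain = parent - sse(left) - sse(right)
        if gain > best_gain:
            best_threshold, best_gain = t, gain
    return best_threshold, best_gain


# Synthetic data (an assumption for illustration): a noisy step at x = 0.3.
rng = random.Random(0)
xs = [rng.random() for _ in range(2000)]
ys = [(1.0 if x > 0.3 else 0.0) + rng.gauss(0, 0.1) for x in xs]

q_t, q_gain = best_split(xs, ys, quantile_candidates(xs, 32))
r_t, r_gain = best_split(xs, ys, random_candidates(xs, 32, rng))
print(f"quantile candidates: threshold={q_t:.3f}, gain={q_gain:.1f}")
print(f"random candidates:   threshold={r_t:.3f}, gain={r_gain:.1f}")
```

The quantile variant has to sort (or, at scale, sketch) the feature values before it can pick candidates, whereas the random variant only needs a uniform sample. Comparing the two printed gains on a given dataset gives a rough feel for how much, if anything, the extra work buys; a single toy run like this is not evidence for the paper's claims.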
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Explainable Artificial Intelligence (XAI)
