Speculative Parallel Evaluation Of Classification Trees On GPGPU Compute Engines
Jason Spencer

TL;DR
This paper explores speculative parallel algorithms for evaluating classification trees on GPUs, reporting a 25% run-time improvement over data decomposition in its tests, with real-time image segmentation as the target application.
Contribution
The paper introduces a speculative parallel algorithm designed for the single instruction, multiple data (SIMD) architectures commonly found in GPUs, aimed at faster classification tree evaluation in on-line and real-time applications.
Findings
In the authors' specific tests, the speculative algorithm improves run time by 25% compared to data decomposition.
GPU implementation outperforms serial CPU evaluation.
Various optimizations are discussed along with their effects on performance.
Abstract
We examine the problem of optimizing classification tree evaluation for on-line and real-time applications by using GPUs. Looking at trees with continuous attributes often used in image segmentation, we first put the existing algorithms for serial and data-parallel evaluation on solid footings. We then introduce a speculative parallel algorithm designed for single instruction, multiple data (SIMD) architectures commonly found in GPUs. A theoretical analysis shows how the run times of data and speculative decompositions compare assuming independent processors. To compare the algorithms in the SIMD environment, we implement both on a CUDA 2.0 architecture machine and compare timings to a serial CPU implementation. Various optimizations and their effects are discussed, and results are given for all algorithms. Our specific tests show a speculative algorithm improves run time by 25%…
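The difference between the serial, data-parallel, and speculative decompositions named in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the tree layout (a complete binary tree stored in arrays), the names `ATTR`, `THRESH`, `LEAF`, and the function names are all assumptions made for illustration, and ordinary Python loops stand in for GPU threads.

```python
# Hypothetical toy tree (not from the paper): a complete binary tree with
# 3 internal nodes stored in arrays. Internal node i has children 2*i+1
# and 2*i+2; indices >= N_INTERNAL denote leaves.
ATTR = [0, 1, 1]           # attribute index tested at each internal node
THRESH = [0.5, 0.3, 0.7]   # continuous threshold at each internal node
LEAF = [0, 1, 1, 2]        # class label of each leaf, left to right
N_INTERNAL = len(ATTR)


def evaluate_serial(x):
    """Walk one root-to-leaf path, testing only the nodes actually visited."""
    node = 0
    while node < N_INTERNAL:
        go_right = x[ATTR[node]] > THRESH[node]
        node = 2 * node + 1 + int(go_right)
    return LEAF[node - N_INTERNAL]


def evaluate_data_parallel(records):
    """Data decomposition: one independent serial walk per record.
    On a GPU each record would map to its own thread; here a list
    comprehension stands in for that."""
    return [evaluate_serial(x) for x in records]


def evaluate_speculative(x):
    """Speculative decomposition for a single record.
    Phase 1 evaluates every node test, whether or not the node is reachable
    (on a GPU, one thread per node, all executing the same instruction).
    Phase 2 follows the precomputed decisions to the leaf."""
    decisions = [x[ATTR[i]] > THRESH[i] for i in range(N_INTERNAL)]
    node = 0
    while node < N_INTERNAL:
        node = 2 * node + 1 + int(decisions[node])
    return LEAF[node - N_INTERNAL]


if __name__ == "__main__":
    records = [[0.2, 0.1], [0.2, 0.9], [0.9, 0.1], [0.9, 0.9]]
    print(evaluate_data_parallel(records))
    print([evaluate_speculative(x) for x in records])
```

In this sketch the speculative version does more total work (every test is evaluated), but phase 1 has no data-dependent branching, which is the property that suits SIMD execution. Whether that trade pays off depends on tree size and hardware, and the sketch makes no claim about that.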
Taxonomy
Topics: Machine Learning and Data Classification · Neural Networks and Applications · Advanced Neural Network Applications
