Runtime Optimizations for Prediction with Tree-Based Models

Nima Asadi, Jimmy Lin, and Arjen P. de Vries

arXiv:1212.2287 · cs.DB · April 29, 2013 · 5 cites

TL;DR

This paper presents techniques to optimize the runtime prediction performance of tree-based models by improving memory layout, reducing branches, and employing vectorization to better utilize modern processor architectures.

Contribution

It introduces architecture-aware methods such as cache-conscious data structures, predication, and vectorization to significantly accelerate tree-based model predictions.
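As a rough illustration of what a cache-conscious layout can look like, the sketch below stores every node of one tree in a single contiguous array and links children by array index rather than by pointer. The names FlatNode, FLAT_TREE, and flat_predict, the field layout, and the example tree are all assumptions made for this sketch, not the paper's actual data structures. The traversal still uses an ordinary data-dependent branch.

```c
/* Hypothetical flat node layout (an assumption for illustration):
   all nodes sit in one contiguous array, children are array indices. */
typedef struct {
    int   feature;    /* index into the feature vector; unused at a leaf */
    float threshold;  /* split threshold, or the prediction at a leaf */
    int   left;       /* index of the left child; -1 marks a leaf */
    int   right;      /* index of the right child; -1 marks a leaf */
} FlatNode;

/* Made-up example tree with two internal nodes and three leaves. */
static const FlatNode FLAT_TREE[] = {
    { 0,  0.5f,  1,  2},  /* node 0: x[0] < 0.5 ? node 1 : node 2 */
    {-1, 10.0f, -1, -1},  /* node 1: leaf, predicts 10 */
    { 1,  0.3f,  3,  4},  /* node 2: x[1] < 0.3 ? node 3 : node 4 */
    {-1, 20.0f, -1, -1},  /* node 3: leaf, predicts 20 */
    {-1, 30.0f, -1, -1},  /* node 4: leaf, predicts 30 */
};

/* Walk from the root (index 0) until a leaf is reached. */
float flat_predict(const FlatNode *nodes, const float *x) {
    int i = 0;
    while (nodes[i].left >= 0) {
        if (x[nodes[i].feature] < nodes[i].threshold) {
            i = nodes[i].left;
        } else {
            i = nodes[i].right;
        }
    }
    return nodes[i].threshold;  /* at a leaf this field holds the prediction */
}
```

Because neighbouring nodes share cache lines, a traversal of a small tree touches fewer distinct lines than one that chases heap-allocated pointers. How much this helps depends on tree size and the hardware.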

Findings

1. Speedup over traditional implementations

2. Enhanced cache utilization

3. Effective use of vectorization techniques

Abstract

Tree-based models have proven to be an effective solution for web ranking as well as other problems in diverse domains. This paper focuses on optimizing the runtime performance of applying such models to make predictions, given an already-trained model. Although exceedingly simple conceptually, most implementations of tree-based models do not efficiently utilize modern superscalar processor architectures. By laying out data structures in memory in a more cache-conscious fashion, removing branches from the execution flow using a technique called predication, and micro-batching predictions using a technique called vectorization, we are able to better exploit modern processor architectures and significantly improve the speed of tree-based models over hard-coded if-else blocks. Our work contributes to the exploration of architecture-conscious runtime implementations of machine learning…
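The two remaining ideas in the abstract, predication and micro-batching, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: PredNode, PRED_TREE, TREE_DEPTH, BATCH, and both function names are hypothetical, the tree is made up, and leaves are assumed to point to themselves so that a loop of fixed depth needs no leaf check. The comparison result (0 or 1 in C) is used directly as an index into the child array, so the traversal has no if-else on the data.

```c
/* Hypothetical node for predicated traversal (an assumption for illustration). */
typedef struct {
    int   feature;    /* feature index tested at this node (0 at leaves) */
    float threshold;  /* split threshold (unused at leaves) */
    int   child[2];   /* child[0] if x[feature] < threshold, else child[1];
                         a leaf stores its own index in both slots */
    float value;      /* prediction, meaningful only at leaves */
} PredNode;

#define TREE_DEPTH 2  /* depth of the made-up example tree below */
#define BATCH 4       /* micro-batch size, chosen arbitrarily here */

static const PredNode PRED_TREE[] = {
    {0, 0.5f, {1, 2},  0.0f},  /* node 0: internal, tests x[0] */
    {0, 0.0f, {1, 1}, 10.0f},  /* node 1: leaf, loops to itself */
    {1, 0.3f, {3, 4},  0.0f},  /* node 2: internal, tests x[1] */
    {0, 0.0f, {3, 3}, 20.0f},  /* node 3: leaf, loops to itself */
    {0, 0.0f, {4, 4}, 30.0f},  /* node 4: leaf, loops to itself */
};

/* Predication: the comparison yields 0 or 1 and selects the next node
   by indexing, replacing the conditional jump with a data dependency. */
float predicated_predict(const PredNode *nodes, int depth, const float *x) {
    int i = 0;
    for (int d = 0; d < depth; ++d) {
        i = nodes[i].child[x[nodes[i].feature] >= nodes[i].threshold];
    }
    return nodes[i].value;
}

/* Micro-batching: advance BATCH instances one tree level at a time, so the
   independent loads of different instances can overlap in the pipeline.
   xs holds BATCH feature vectors stored back to back (row-major). */
void batch_predict(const PredNode *nodes, int depth, const float *xs,
                   int num_features, float *out) {
    int idx[BATCH] = {0};  /* current node of every instance, all at the root */
    for (int d = 0; d < depth; ++d) {
        for (int b = 0; b < BATCH; ++b) {
            const PredNode *n = &nodes[idx[b]];
            idx[b] = n->child[xs[b * num_features + n->feature] >= n->threshold];
        }
    }
    for (int b = 0; b < BATCH; ++b) {
        out[b] = nodes[idx[b]].value;
    }
}
```

The fixed-depth loop does some redundant work for instances that reach a leaf early. The trade is extra arithmetic in exchange for control flow the processor never has to guess, and whether that pays off depends on the tree shape and the hardware.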

Taxonomy

Topics: Machine Learning and Data Classification · Data Stream Mining Techniques · Parallel Computing and Optimization Techniques
