PACSET (Packed Serialized Trees): Reducing Inference Latency for Tree Ensemble Deployment
Meghana Madhyastha, Kunal Lillaney, James Browne, Joshua Vogelstein, Randal Burns

TL;DR
This paper introduces PACSET, a serialization method for tree ensembles that optimizes the data layout for external memory, reducing inference latency when the model is not already loaded into memory, as is common on resource-constrained devices.
Contribution
PACSET is a serialization technique that encodes reference locality in the layout of a tree ensemble, reducing I/O and inference latency in low-resource and on-demand deployment scenarios.
Findings
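Achieves a 2-6 times reduction in classification latency for interactive workloads.
Interleaves correlated nodes across trees, collocates nodes on the most popular paths using leaf cardinality, and optimizes the layout for the I/O blocksize.
Applies when models are larger than memory, deployed on low-resource devices, or run as on-demand Web micro-services.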
Abstract
We present methods to serialize and deserialize tree ensembles that optimize inference latency when models are not already loaded into memory. This arises whenever models are larger than memory, but also systematically when models are deployed on low-resource devices, such as in the Internet of Things, or run as Web micro-services where resources are allocated on demand. Our packed serialized trees (PACSET) encode reference locality in the layout of a tree ensemble using principles from external memory algorithms. The layout interleaves correlated nodes across multiple trees, uses leaf cardinality to collocate the nodes on the most popular paths and is optimized for the I/O blocksize. The result is that each I/O yields a higher fraction of useful data, leading to a 2-6 times reduction in classification latency for interactive workloads.
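The layout idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes complete binary trees stored in heap order, a hypothetical `interleave_depth` parameter for how many top levels are interleaved across trees, and a made-up cardinality table in which left children are always the more popular ones. It compares how many I/O blocks one inference touches under a plain tree-by-tree layout and under a packed layout.

```python
def heap_children(i, n):
    """Children of node i in a complete binary tree of n nodes (heap order)."""
    return [c for c in (2 * i + 1, 2 * i + 2) if c < n]


def per_tree_layout(num_trees, n):
    """Baseline: each tree is serialized contiguously, breadth-first."""
    return [(t, i) for t in range(num_trees) for i in range(n)]


def packed_layout(num_trees, n, cardinality, interleave_depth):
    """Illustrative packed layout (an assumption, not the paper's exact method).

    The top `interleave_depth` levels of all trees are interleaved level by
    level. Each remaining subtree is then written depth-first, visiting the
    child with the higher cardinality first so popular paths stay adjacent.
    """
    order = []
    frontier = [(t, 0) for t in range(num_trees)]  # roots of every tree
    for _ in range(interleave_depth):
        order.extend(frontier)
        frontier = [(t, c) for (t, i) in frontier for c in heap_children(i, n)]
    for t, root in frontier:
        stack = [root]
        while stack:
            i = stack.pop()
            order.append((t, i))
            # Push the least popular child first so the most popular pops first.
            stack.extend(sorted(heap_children(i, n), key=lambda c: cardinality[t][c]))
    return order


def blocks_touched(order, paths, block_size):
    """Number of distinct I/O blocks read when following one path per tree."""
    position = {node: p for p, node in enumerate(order)}
    return len({position[(t, i)] // block_size
                for t, path in enumerate(paths) for i in path})


# Hypothetical ensemble: 4 trees, 15 nodes each, 8 nodes per I/O block.
num_trees, n, block_size = 4, 15, 8
# Assumed cardinalities: left children (odd indices) see most of the samples.
cardinality = [[10 if i % 2 else 1 for i in range(n)] for _ in range(num_trees)]
# A sample that follows the popular (leftmost) path in every tree.
paths = [[0, 1, 3, 7] for _ in range(num_trees)]

baseline = blocks_touched(per_tree_layout(num_trees, n), paths, block_size)
packed = blocks_touched(
    packed_layout(num_trees, n, cardinality, interleave_depth=2), paths, block_size
)
print("blocks read, per-tree layout:", baseline)
print("blocks read, packed layout:  ", packed)
```

In this toy setting the packed layout reads fewer blocks for the popular path because the roots and upper levels of all trees share blocks and each popular child sits directly after its parent. The size of the gain depends on the block size, tree depth, and how skewed the path popularity is; the numbers here say nothing about the latency results reported in the paper.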
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Time Series Analysis and Forecasting
