End-to-end Optimization of Machine Learning Prediction Queries
Kwanghyun Park, Karla Saur, Dalitso Banda, Rathijit Sen, Matteo Interlandi, Konstantinos Karanasos

TL;DR
Raven is a system that optimizes prediction queries by unifying data and ML operators into a single graph, enabling logical and physical optimizations across runtimes and hardware for significant performance improvements.
Contribution
Raven introduces a unified intermediate representation for prediction queries that captures data and ML operators in a single graph. This enables cross-operator optimizations and runtime/hardware selection for query parts that had so far been optimized in isolation.
Findings
Up to 13.1x performance improvement on Apache Spark
Up to 330x speedup on SQL Server
Up to 8x acceleration for complex models using GPU
Abstract
Prediction queries are widely used across industries to perform advanced analytics and draw insights from data. They include a data processing part (e.g., for joining, filtering, cleaning, featurizing the datasets) and a machine learning (ML) part invoking one or more trained models to perform predictions. These parts have so far been optimized in isolation, leaving significant opportunities for optimization unexplored. We present Raven, a production-ready system for optimizing prediction queries. Raven follows the enterprise architectural trend of collocating data and ML runtimes. It relies on a unified intermediate representation that captures both data and ML operators in a single graph structure to unlock two families of optimizations. First, it employs logical optimizations that pass information between the data part (and the properties of the underlying data) and the ML part to…
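To make the idea of passing information from the data part to the ML part concrete, here is a minimal sketch in Python. It is not Raven's implementation: the `Filter`, `Predict`, `Split`, and `Leaf` classes and the `prune`, `optimize`, and `run` functions are hypothetical names invented for illustration, and the plan is a simple operator list rather than a full graph. The sketch assumes a query that filters rows on a feature and then scores them with a decision tree, and it uses the filter's bound to drop tree branches that no surviving row can reach.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: float


@dataclass
class Split:
    feature: str
    threshold: float
    left: "Node"   # taken when row[feature] <= threshold
    right: "Node"  # taken when row[feature] > threshold


Node = Union[Leaf, Split]


@dataclass
class Filter:
    """Hypothetical data operator: keeps rows with row[feature] <= upper_bound."""
    feature: str
    upper_bound: float


@dataclass
class Predict:
    """Hypothetical ML operator: scores each row with a decision tree."""
    tree: Node


def prune(node, feature, upper_bound):
    """Drop branches that are unreachable when row[feature] <= upper_bound."""
    if isinstance(node, Leaf):
        return node
    left = prune(node.left, feature, upper_bound)
    right = prune(node.right, feature, upper_bound)
    if node.feature == feature and upper_bound <= node.threshold:
        # Every surviving row satisfies row[feature] <= threshold,
        # so the right branch can never be taken.
        return left
    return Split(node.feature, node.threshold, left, right)


def optimize(plan):
    """Rewrite each Predict using the bounds of the Filters that precede it."""
    bounds, out = [], []
    for op in plan:
        if isinstance(op, Filter):
            bounds.append((op.feature, op.upper_bound))
        elif isinstance(op, Predict):
            tree = op.tree
            for feature, upper_bound in bounds:
                tree = prune(tree, feature, upper_bound)
            op = Predict(tree)
        out.append(op)
    return out


def evaluate(node, row):
    """Walk the tree for one row and return the leaf value."""
    while isinstance(node, Split):
        node = node.left if row[node.feature] <= node.threshold else node.right
    return node.value


def run(plan, rows):
    """Execute a linear plan: filters narrow the rows, Predict scores them."""
    out = list(rows)
    for op in plan:
        if isinstance(op, Filter):
            out = [r for r in out if r[op.feature] <= op.upper_bound]
        else:
            out = [evaluate(op.tree, r) for r in out]
    return out


tree = Split("age", 40,
             Split("income", 50, Leaf(0.1), Leaf(0.4)),
             Split("income", 80, Leaf(0.6), Leaf(0.9)))
plan = [Filter("age", 30), Predict(tree)]
optimized = optimize(plan)
rows = [{"age": 25, "income": 30},
        {"age": 28, "income": 70},
        {"age": 55, "income": 90}]
```

In this toy example the filter `age <= 30` makes the whole `age > 40` subtree unreachable, so the optimized plan scores rows with a smaller tree while `run(plan, rows)` and `run(optimized, rows)` return the same predictions. A real system would work over a graph of many operator types and richer data properties; the list-based plan here is only meant to show the direction of the information flow.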
