Odyssey: An End-to-End System for Pareto-Optimal Serverless Query Processing
Shyam Jesalpura, Shengda Zhu, Amir Shaikhha, Antonio Barbalace, Boris Grot

TL;DR
Odyssey is an end-to-end serverless data analytics system that automatically generates and evaluates query plans to optimize cost and performance, outperforming existing solutions like AWS Athena.
Contribution
Odyssey introduces a novel integrated pipeline, comprising a query planner, a cost model, and an execution engine, that finds Pareto-optimal plans for serverless analytics.
Findings
Accurately predicts cost and latency.
Outperforms AWS Athena in cost and latency.
Effectively balances cost and performance for complex queries.
Abstract
Running data analytics queries on serverless (FaaS) workers has been shown to be cost- and performance-efficient for a variety of real-world scenarios, including intermittent query arrival patterns, sudden load spikes and management challenges that afflict managed VM clusters. Alas, existing serverless data analytics works focus primarily on the serverless execution engine and assume the existence of a "good" query execution plan or rely on user guidance to construct such a plan. Meanwhile, even simple analytics queries on serverless have a huge space of possible plans, with vast differences in both performance and cost among plans. This paper introduces Odyssey, an end-to-end serverless-native data analytics pipeline that integrates a query planner, cost model and execution engine. Odyssey automatically generates and evaluates serverless query plans, utilizing state space pruning…
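To make "Pareto-optimal" concrete: a plan is on the Pareto front if no other plan is at least as good on both cost and latency and strictly better on one. The sketch below is a generic illustration of that definition, not Odyssey's planner or cost model. The `Plan` type, the `pareto_front` function, and the plan names and numbers are all hypothetical and chosen only for the example.

```python
from typing import List, NamedTuple


class Plan(NamedTuple):
    """A candidate query plan with predicted metrics (hypothetical type)."""
    name: str
    cost_usd: float   # predicted monetary cost
    latency_s: float  # predicted end-to-end latency


def pareto_front(plans: List[Plan]) -> List[Plan]:
    """Return the plans that no other plan dominates on cost and latency."""
    front: List[Plan] = []
    best_latency = float("inf")
    # Visit plans from cheapest to most expensive (ties: fastest first).
    # A plan is kept only if it is strictly faster than every cheaper
    # (or equally cheap) plan already seen; otherwise it is dominated.
    for plan in sorted(plans, key=lambda p: (p.cost_usd, p.latency_s)):
        if plan.latency_s < best_latency:
            front.append(plan)
            best_latency = plan.latency_s
    return front


# Made-up candidates for illustration only; not measurements from the paper.
candidates = [
    Plan("few_small_workers", 0.10, 40.0),
    Plan("medium_parallelism", 0.20, 25.0),
    Plan("skewed_partitioning", 0.25, 30.0),   # dominated by medium_parallelism
    Plan("high_parallelism", 0.50, 10.0),
    Plan("over_provisioned", 0.60, 10.0),      # dominated by high_parallelism
]

front_names = [p.name for p in pareto_front(candidates)]
print(front_names)
```

A user would then pick a point on this front according to budget or deadline; how Odyssey enumerates and prunes candidates before this step is not covered by the sketch.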
