Pay One, Get Hundreds for Free: Reducing Cloud Costs through Shared Query Execution
Renato Marroquín, Ingo Müller, Darko Makreshanski, Gustavo Alonso

TL;DR
This paper presents a query rewriting technique that combines multiple concurrent queries into a single shared execution plan, significantly reducing cloud query costs without system modifications.
Contribution
It introduces a novel query rewriting method for shared execution in cloud data analysis, lowering costs and increasing throughput without altering existing systems.
Findings
Shared execution does less total work than running each query individually.
Cost savings of up to 100x in Amazon Athena and Google BigQuery.
Higher throughput achieved with shared query execution.
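A back-of-the-envelope sketch of why sharing pays off under per-byte pricing, assuming every query scans the same table in full and the combined query scans it only once. The price and table size below are hypothetical placeholders, not figures from the paper, and the model ignores any extra bytes a combined query might read or return.

```python
def individual_cost(num_queries, table_gb, price_per_gb):
    """Cost when every query scans the whole table on its own."""
    return num_queries * table_gb * price_per_gb


def shared_cost(table_gb, price_per_gb):
    """Cost when all queries are answered by one scan of the table."""
    return table_gb * price_per_gb


def savings_factor(num_queries, table_gb, price_per_gb):
    """How many times cheaper the shared scan is in this simplified model."""
    return individual_cost(num_queries, table_gb, price_per_gb) / shared_cost(
        table_gb, price_per_gb
    )


# Hypothetical numbers: 100 queries, a 10 GB table, 5 cost units per GB.
factor = savings_factor(num_queries=100, table_gb=10, price_per_gb=5)
```

In this simplified model the savings factor equals the number of queries sharing the scan, which is why batching many queries over the same data is attractive.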
Abstract
Cloud-based data analysis is nowadays common practice because of the lower system management overhead as well as the pay-as-you-go pricing model. The pricing model, however, is not always suitable for query processing as heavy use results in high costs. For example, in query-as-a-service systems, where users are charged per processed byte, collections of queries accessing the same data frequently can become expensive. The problem is compounded by the limited options for the user to optimize query execution when using declarative interfaces such as SQL. In this paper, we show how, without modifying existing systems and without the involvement of the cloud provider, it is possible to significantly reduce the overhead, and hence the cost, of query-as-a-service systems. Our approach is based on query rewriting so that multiple concurrent queries are combined into a single query. Our…
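To make the idea of combining concurrent queries concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. It is an illustration under simplifying assumptions, not the paper's actual rewrite rules: it handles only single-table SUM queries that differ in their filter predicate, and the helper build_shared_query, the sales table, and its columns are hypothetical names introduced for this example.

```python
import sqlite3


def build_shared_query(table, predicates, value_column):
    """Merge several 'SELECT SUM(col) FROM table WHERE pred' queries.

    Hypothetical helper for illustration only. Predicates are trusted SQL
    boolean expressions; a real system would have to parse and validate them.
    """
    # One conditional aggregate per original query, all fed by a single scan.
    select_items = [
        f"SUM(CASE WHEN {pred} THEN {value_column} END) AS q{i}"
        for i, pred in enumerate(predicates)
    ]
    # A row is needed if at least one of the original queries selects it.
    where = " OR ".join(f"({pred})" for pred in predicates)
    return f"SELECT {', '.join(select_items)} FROM {table} WHERE {where}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EU", 10), ("US", 20), ("EU", 5), ("ASIA", 7)],
)

predicates = ["region = 'EU'", "amount > 6"]

# Baseline: each query runs (and would be billed) separately.
individual = [
    conn.execute(f"SELECT SUM(amount) FROM sales WHERE {p}").fetchone()[0]
    for p in predicates
]

# Shared execution: one statement, one scan, one result column per query.
shared_sql = build_shared_query("sales", predicates, "amount")
shared = list(conn.execute(shared_sql).fetchone())
```

Both approaches return the same per-query answers, but the shared statement reads the table once. On a service that bills per processed byte, that single scan is what the user would pay for in this simplified setting.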
