Journey of Migrating Millions of Queries on The Cloud
Taro L. Saito, Naoki Takezoe, Yukihiro Okada, Takako Shimamoto, Dongmin Yu, Suprith Chandrashekharachar, Kai Sasaki, Shohei Okumiya, Yan Wang, Takashi Kurihara, Ryu Kobayashi, Keisuke Suzuki, Zhenghong Yang, Makoto Onizuka

TL;DR
This paper discusses the challenges and solutions involved in migrating millions of distributed SQL queries to a new cloud-based query engine, ensuring correctness and performance through customer-specific benchmarking and simulation.
Contribution
It introduces a system for migrating large-scale queries that includes customer-specific benchmarking, query set minimization, and performance regression detection.
Findings
Effective query set minimization improves testing efficiency.
Customer-specific benchmarks help detect incompatibilities.
Simulation results aid in identifying performance regressions.
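To make the last finding concrete, here is a minimal sketch of how replayed query timings might be compared between the current and the new engine version. It is an illustration only, not the paper's reporting system: the function name `find_regressions`, the per-query runtime dictionaries, and the 1.5x slowdown threshold are all assumptions.

```python
def find_regressions(baseline, candidate, threshold=1.5):
    """Return (query_id, slowdown_ratio) pairs, worst slowdown first.

    baseline / candidate: hypothetical dicts mapping a query id to its
    runtime in seconds on the old and new engine versions.
    threshold: assumed slowdown ratio at which a query is flagged.
    """
    regressions = []
    for query_id, old_seconds in baseline.items():
        new_seconds = candidate.get(query_id)
        # Skip queries that were not replayed or have no usable baseline.
        if new_seconds is None or old_seconds <= 0:
            continue
        ratio = new_seconds / old_seconds
        if ratio >= threshold:
            regressions.append((query_id, ratio))
    return sorted(regressions, key=lambda item: item[1], reverse=True)


baseline = {"q1": 10.0, "q2": 4.0, "q3": 2.0}
candidate = {"q1": 11.0, "q2": 12.0, "q3": 4.0}
report = find_regressions(baseline, candidate)
```

In this sample, `q2` and `q3` are flagged while `q1`, which is only 10% slower, is not. A real report would also need to account for run-to-run variance, which this sketch ignores.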
Abstract
Treasure Data is processing millions of distributed SQL queries every day on the cloud. Upgrading the query engine service at this scale is challenging because we need to migrate all of the production queries of the customers to a new version while preserving the correctness and performance of the data processing pipelines. To ensure the quality of the query engines, we utilize our query logs to build customer-specific benchmarks and replay these queries with real customer data in a secure pre-production environment. To simulate millions of queries, we need effective minimization of test query sets and better reporting of the simulation results to proactively find incompatible changes and performance regression of the new version. This paper describes the overall design of our system and shares various challenges in maintaining the quality of the query engine service on the cloud.
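The abstract does not say how the test query set is minimized. One common approach, sketched below under assumptions of our own, is to group logged queries by a normalized signature and replay only one representative per group. The helpers `query_signature` and `minimize_query_set` are hypothetical names, and the regex-based normalization is a simplification. A production system would more likely compare parsed query structure.

```python
import re


def query_signature(sql):
    """Hypothetical normalization: mask literals and collapse whitespace."""
    text = sql.lower()
    text = re.sub(r"'(?:[^']|'')*'", "?", text)      # string literals
    text = re.sub(r"\b\d+(?:\.\d+)?\b", "?", text)   # numeric literals
    return re.sub(r"\s+", " ", text).strip()


def minimize_query_set(queries):
    """Keep the first query seen for each distinct signature."""
    representatives = {}
    for sql in queries:
        representatives.setdefault(query_signature(sql), sql)
    return list(representatives.values())


logged = [
    "SELECT * FROM logs WHERE id = 1",
    "select *  from logs where id = 42",
    "SELECT name FROM users WHERE city = 'Tokyo'",
    "SELECT name FROM users WHERE city = 'Osaka'",
    "SELECT count(*) FROM logs",
]
minimized = minimize_query_set(logged)
```

Here five logged queries reduce to three representatives, because queries that differ only in literal values or spacing share a signature. The trade-off is coverage: two queries with the same shape can still behave differently on different data, so the signature's granularity decides how much risk the reduction introduces.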
