Learned Offline Query Planning via Bayesian Optimization
Jeffrey Tao, Natalie Maus, Haydn Jones, Yimeng Zeng, Jacob R. Gardner, Ryan Marcus

TL;DR
This paper introduces an offline query optimization method that uses Bayesian optimization and variational auto-encoders to find faster query plans for repeated queries, outperforming traditional and RL-based systems.
Contribution
It presents a novel offline query optimizer combining variational auto-encoders with Bayesian optimization to improve plan quality for repeated queries.
Findings
The method finds faster query plans than the best plans obtainable with PostgreSQL.
It outperforms recent RL-based query optimization systems.
The approach is effective across multiple datasets.
Abstract
Analytics database workloads often contain queries that are executed repeatedly. Existing optimization techniques generally prioritize keeping optimization cost low, normally well below the time it takes to execute a single instance of a query. If a given query is going to be executed thousands of times, could it be worth investing significantly more optimization time? In contrast to traditional online query optimizers, we propose an offline query optimizer that searches a wide variety of plans and incorporates query execution as a primitive. Our offline query optimizer combines variational auto-encoders with Bayesian optimization to find optimized plans for a given query. We compare our technique to the optimal plans possible with PostgreSQL and recent RL-based systems over several datasets, and show that our technique finds faster query plans.
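The abstract describes a loop in which candidate plans are proposed, executed, and the measured latency feeds back into the search. The sketch below illustrates that general shape only. It is not the authors' implementation: the abstract does not specify the surrogate model, the acquisition function, or the auto-encoder architecture, so this example assumes a Gaussian process with an RBF kernel, a lower-confidence-bound acquisition rule, and a 2-dimensional latent space. The function `execute_plan` is a hypothetical stand-in for decoding a latent vector into a query plan and timing it on a real database.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and the rows of b.
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    # Posterior mean and standard deviation of a zero-mean, unit-variance GP.
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(k_tt, k_tq)
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tq * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def execute_plan(z):
    # HYPOTHETICAL: stands in for "decode z into a plan, run it, time it".
    # A toy quadratic whose minimum latency is 0.1 at z = (0.7, 0.7).
    return float(np.sum((z - 0.7) ** 2) + 0.1)

def optimize_offline(n_init=5, n_iters=20, dim=2, kappa=2.0, seed=0):
    rng = np.random.default_rng(seed)
    # Start from a few randomly chosen latent points and their latencies.
    zs = rng.uniform(-2, 2, size=(n_init, dim))
    latencies = np.array([execute_plan(z) for z in zs])
    initial_best = latencies.min()
    for _ in range(n_iters):
        # Standardize observed latencies so the unit-variance prior is reasonable.
        mu, sigma = latencies.mean(), latencies.std() + 1e-9
        candidates = rng.uniform(-2, 2, size=(256, dim))
        mean, std = gp_posterior(zs, (latencies - mu) / sigma, candidates)
        # Lower confidence bound: prefer low predicted latency or high uncertainty.
        z_next = candidates[np.argmin(mean - kappa * std)]
        # "Query execution as a primitive": measure the chosen candidate.
        zs = np.vstack([zs, z_next])
        latencies = np.append(latencies, execute_plan(z_next))
    best = int(np.argmin(latencies))
    return zs[best], latencies[best], initial_best

if __name__ == "__main__":
    z_best, best_latency, initial_best = optimize_offline()
    print("best latent point:", z_best)
    print("best latency:", best_latency, "initial best:", initial_best)
```

In a real setting each call to `execute_plan` would cost a full query execution, which is why this kind of search only makes sense offline and for queries that will be repeated many times.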
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Database Systems and Queries · Data Management and Algorithms
