JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning
Kaiwen Wang, Junxiong Wang, Yueying Li, Nathan Kallus, Immanuel Trummer, Wen Sun

TL;DR
JoinGym is a lightweight, offline simulation environment for reinforcement learning-based join order selection, enabling rapid prototyping on realistic database workloads with a novel cardinality dataset.
Contribution
The paper introduces JoinGym, a new offline query optimization environment for RL that supports both left-deep and bushy joins, with a novel dataset for cardinality estimation.
Findings
RL algorithms exhibit heavy-tailed cost distributions
JoinGym significantly improves throughput over existing environments
The cardinality dataset aids in realistic query cost simulation
Abstract
Join order selection (JOS) is the problem of ordering join operations to minimize total query execution cost, and it is the core NP-hard combinatorial optimization problem of query optimization. In this paper, we present JoinGym, a lightweight and easy-to-use query optimization environment for reinforcement learning (RL) that captures both the left-deep and bushy variants of the JOS problem. Compared to existing query optimization environments, the key advantages of JoinGym are usability and significantly higher throughput, which we accomplish by simulating query executions entirely offline. Under the hood, JoinGym simulates a query plan's cost by looking up intermediate result cardinalities from a pre-computed dataset. We release a novel cardinality dataset for SQL queries based on real IMDb workloads, which may be of independent interest, e.g., for cardinality estimation. Finally,…
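To illustrate the offline lookup idea described in the abstract, here is a minimal Python sketch. It is not JoinGym's API: the table names, the `CARDINALITIES` dictionary, its numbers, and both function names are hypothetical, and the cost model (sum of intermediate result cardinalities along a left-deep plan) is an assumption made for this example.

```python
from itertools import permutations

# Hypothetical pre-computed cardinalities: each key is the set of tables
# joined so far, each value is the size of that intermediate result.
# The numbers are invented for illustration only.
CARDINALITIES = {
    frozenset({"a", "b"}): 1000,
    frozenset({"a", "c"}): 50,
    frozenset({"b", "c"}): 200,
    frozenset({"a", "b", "c"}): 120,
}


def left_deep_cost(order, cardinalities):
    """Cost of a left-deep plan, assumed here to be the sum of the
    cardinalities of every intermediate result, found by table lookup."""
    joined = {order[0]}
    total = 0
    for table in order[1:]:
        joined.add(table)
        # No query is executed: the size is read from the lookup table.
        total += cardinalities[frozenset(joined)]
    return total


def best_left_deep_order(tables, cardinalities):
    """Exhaustive search over left-deep orders; feasible only for tiny
    queries, which is why learned policies are of interest at all."""
    return min(permutations(tables),
               key=lambda order: left_deep_cost(order, cardinalities))


cost_abc = left_deep_cost(("a", "b", "c"), CARDINALITIES)
cost_acb = left_deep_cost(("a", "c", "b"), CARDINALITIES)
best = best_left_deep_order(("a", "b", "c"), CARDINALITIES)
```

In this toy setup, joining `a` with `c` first yields a much smaller intermediate result than joining `a` with `b` first, so the two orders differ in cost even though the final result is the same size. Because each step is a dictionary lookup rather than a query execution, evaluating a plan is cheap, which is the property the abstract credits for the higher throughput.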
Taxonomy
Topics: Optimization and Search Problems · Complexity and Algorithms in Graphs · Constraint Satisfaction and Optimization
