TL;DR
PATSQL is an algorithm that synthesizes SQL queries from example input and output tables. It applies relational algebra transformation rules to infer the projected columns quickly, which reduces the computational cost of synthesis as the example tables grow.
Contribution
The paper introduces a new relational algebra-based approach for fast SQL query synthesis, enabling quick inference of projected columns and efficient sketch completion.
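To make the problem setting concrete, the following is a minimal sketch of the programming-by-example specification: a candidate query is acceptable when running it on the example input table reproduces the example output table. It is not PATSQL's search procedure. The helper `matches_example`, the table name `t`, and the sample data are all hypothetical, and candidates are simply enumerated by hand here.

```python
import sqlite3
from collections import Counter


def matches_example(input_rows, columns, query, expected_rows):
    """Return True if `query`, run on the input table, yields the expected
    rows. Rows are compared as a multiset, so row order is ignored."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical single input table named `t` with untyped columns.
        conn.execute(f"CREATE TABLE t ({', '.join(columns)})")
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(f"INSERT INTO t VALUES ({placeholders})", input_rows)
        actual = conn.execute(query).fetchall()
    finally:
        conn.close()
    return Counter(actual) == Counter(tuple(r) for r in expected_rows)


# Illustrative example tables (not taken from the paper's benchmarks).
columns = ["name", "dept", "salary"]
input_rows = [("ann", "eng", 120), ("bob", "eng", 100), ("cy", "ops", 90)]
expected = [("eng", 220), ("ops", 90)]

# Hand-written candidates; a synthesizer would generate these automatically.
candidates = [
    "SELECT dept, MAX(salary) FROM t GROUP BY dept",
    "SELECT dept, SUM(salary) FROM t GROUP BY dept",
]
consistent = [
    q for q in candidates if matches_example(input_rows, columns, q, expected)
]
print(consistent)
```

Only the `SUM` candidate is consistent with the example. A check like this has to execute the query on the example tables, which is one reason synthesis cost tends to grow with table size.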
Findings
Solved 68% of benchmark queries
Found 89% of solutions within one second
Outperforms prior methods in scalability
Abstract
SQL is one of the most popular tools for data analysis, and it is now used by an increasing number of users without expertise in databases. Several studies have proposed programming-by-example approaches to help such non-experts write correct SQL queries. While existing methods support a variety of SQL features such as aggregation and nested queries, they suffer a significant increase in computational cost as the scale of example tables increases. In this paper, we propose an efficient algorithm utilizing properties known in relational algebra to synthesize SQL queries from input and output tables. Our key insight is that a projection operator in a program sketch can be lifted above other operators by applying transformation rules in relational algebra, while preserving the semantics of the program. This enables a quick inference of appropriate columns in the projection…
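The idea of lifting a projection can be illustrated with one textbook relational algebra rule: a selection applied to a projected table gives the same result as projecting after the selection, because the predicate can only mention columns that the projection keeps. The sketch below checks this on a small table under bag semantics. The abstract does not say which rules PATSQL applies, so the choice of rule is an assumption, and the helpers `project`, `select`, and `as_bag` and the sample rows are hypothetical.

```python
from collections import Counter


def project(rows, cols):
    """Bag projection: keep only `cols`, preserving duplicate rows."""
    return [{c: r[c] for c in cols} for r in rows]


def select(rows, pred):
    """Selection: keep the rows for which `pred` holds."""
    return [r for r in rows if pred(r)]


def as_bag(rows):
    """Order-insensitive multiset view of a list of row dicts."""
    return Counter(tuple(sorted(r.items())) for r in rows)


# Illustrative input table (not from the paper).
rows = [
    {"id": 1, "dept": "eng", "salary": 120},
    {"id": 2, "dept": "eng", "salary": 100},
    {"id": 3, "dept": "ops", "salary": 90},
]
cols = ["dept", "salary"]


def pred(r):
    # The predicate mentions only a projected column.
    return r["salary"] >= 100


projection_inside = select(project(rows, cols), pred)  # selection over projection
projection_lifted = project(select(rows, pred), cols)  # projection lifted to the top

same = as_bag(projection_inside) == as_bag(projection_lifted)
print(same)
```

With the projection at the top of the expression, its columns can be compared directly against the columns of the output table, which matches the abstract's description of quickly inferring the projected columns.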
