GenRewrite: Query Rewriting via Large Language Models
Jie Liu, Barzan Mozafari

TL;DR
GenRewrite uses large language models, Natural Language Rewrite Rules, and counterexample-guided correction to rewrite queries. It outperforms baseline methods on the TPC-DS and JOB benchmarks while reducing errors and manual effort.
Contribution
It introduces a holistic LLM-based query rewriting system with Natural Language Rewrite Rules and a counterexample-guided correction method, advancing beyond traditional rule-based and synthesis approaches.
Findings
GenRewrite outperforms all baselines on TPC-DS and JOB benchmarks.
It achieves a speedup of at least 2x on 25 TPC-DS queries, 1.35x more queries than LLM baselines.
The system reduces errors and manual effort in query rewriting.
Abstract
Query rewriting is an effective technique for refining poorly written queries before they reach the query optimizer. However, manual rewriting is not scalable, as it is prone to errors and requires deep expertise. Traditional query rewriting algorithms fall short too: rule-based approaches fail to generalize to new query patterns, while synthesis-based methods struggle with complex queries. Fortunately, Large Language Models (LLMs) already possess broad knowledge and advanced reasoning capabilities, making them a promising solution for tackling these longstanding challenges. In this paper, we present GenRewrite, the first holistic system that leverages LLMs for query rewriting beyond traditional rules. We introduce the notion of Natural Language Rewrite Rules (NLR2s), which serve as hints for the LLM while also serving as a means of knowledge transfer from rewriting one query to another, allowing…
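As an illustration of what checking a rewrite can look like, the following is a minimal sketch, not GenRewrite's implementation. It runs an original query and a candidate rewrite against a small in-memory SQLite database and reports any rows on which they disagree, which is one simple way to surface a counterexample. The schema, the queries, and the helper names `find_counterexample` and `build_demo_db` are all hypothetical and chosen only for this example.

```python
import sqlite3
from collections import Counter


def find_counterexample(conn, original_sql, rewritten_sql):
    """Compare two queries on one database, treating results as multisets.

    Returns None when the results match, otherwise a dict with the rows
    that appear in only one of the two results. Hypothetical helper.
    """
    orig_rows = Counter(conn.execute(original_sql).fetchall())
    new_rows = Counter(conn.execute(rewritten_sql).fetchall())
    if orig_rows == new_rows:
        return None
    return {
        "only_in_original": sorted((orig_rows - new_rows).elements(), key=repr),
        "only_in_rewrite": sorted((new_rows - orig_rows).elements(), key=repr),
    }


def build_demo_db():
    """Create a tiny illustrative database (assumed schema, not from the paper)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'EU'), (2, 'US'), (3, 'EU');
        INSERT INTO orders VALUES (10, 1, 50.0), (11, 2, 75.0), (12, 3, 20.0), (13, 1, 5.0);
        """
    )
    return conn


# Original query: filter orders through an IN subquery.
ORIGINAL = """
    SELECT o.id FROM orders o
    WHERE o.customer_id IN (SELECT c.id FROM customers c WHERE c.region = 'EU')
"""

# Candidate rewrite that turns the subquery into a join (customers.id is unique).
GOOD_REWRITE = """
    SELECT o.id FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE c.region = 'EU'
"""

# Deliberately wrong rewrite: the region filter was changed.
BAD_REWRITE = """
    SELECT o.id FROM orders o JOIN customers c ON o.customer_id = c.id
    WHERE c.region = 'US'
"""

if __name__ == "__main__":
    conn = build_demo_db()
    print(find_counterexample(conn, ORIGINAL, GOOD_REWRITE))
    print(find_counterexample(conn, ORIGINAL, BAD_REWRITE))
```

Agreement on a single test database does not prove that two queries are equivalent; it only fails to find a counterexample. A reported difference, on the other hand, is concrete evidence that the rewrite changed the query's meaning.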
Taxonomy
Topics: Data Quality and Management · Semantic Web and Ontologies · Topic Modeling
