Metasql: A Generate-then-Rank Framework for Natural Language to SQL Translation
Yuankai Fan, Zhenying He, Tonghui Ren, Can Huang, Yinan Jing, Kai Zhang, X. Sean Wang

TL;DR
Metasql is a generate-then-rank framework that enhances natural language to SQL translation by controlling candidate generation with metadata and selecting the best query through learning-to-rank methods, improving accuracy on benchmarks.
Contribution
Introduces a flexible generate-then-rank framework with query metadata and learning-to-rank algorithms to improve NL to SQL translation accuracy.
Findings
Significantly improves translation accuracy on benchmark datasets.
Effectively controls SQL candidate generation with metadata.
Consistently improves the translation accuracy of existing auto-regressive NLIDB models when combined with them.
Abstract
The Natural Language Interface to Databases (NLIDB) empowers non-technical users with database access through intuitive natural language (NL) interactions. Advanced approaches, utilizing neural sequence-to-sequence models or large-scale language models, typically employ auto-regressive decoding to generate unique SQL queries sequentially. While these translation models have greatly improved the overall translation accuracy, surpassing 70% on NLIDB benchmarks, the use of auto-regressive decoding to generate single SQL queries may result in sub-optimal outputs, potentially leading to erroneous translations. In this paper, we propose Metasql, a unified generate-then-rank framework that can be flexibly incorporated with existing NLIDBs to consistently improve their translation accuracy. Metasql introduces query metadata to control the generation of better SQL query candidates and uses…
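The generate-then-rank idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the metadata labels (`project`, `aggregate`, `filter`), the `toy_generate` lookup table, and the `toy_score` heuristic are all hypothetical stand-ins for a metadata-conditioned translation model and a learned ranking model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Candidate:
    sql: str
    metadata: str  # hypothetical metadata label that conditioned this candidate


def generate_then_rank(
    question: str,
    metadata_conditions: Sequence[str],
    generate: Callable[[str, str], str],
    score: Callable[[str, str], float],
) -> List[Candidate]:
    """Generate one candidate per metadata condition, then sort by score."""
    seen = set()
    candidates = []
    for meta in metadata_conditions:
        sql = generate(question, meta)
        if sql not in seen:  # keep only the first occurrence of each SQL string
            seen.add(sql)
            candidates.append(Candidate(sql, meta))
    # Highest-scoring candidate first; ties keep generation order.
    return sorted(candidates, key=lambda c: score(question, c.sql), reverse=True)


# Assumption: a fixed lookup table stands in for a real translation model.
TOY_OUTPUTS = {
    "project": "SELECT name FROM singers",
    "aggregate": "SELECT COUNT(*) FROM singers",
    "filter": "SELECT name FROM singers",
}


def toy_generate(question: str, meta: str) -> str:
    return TOY_OUTPUTS[meta]


def toy_score(question: str, sql: str) -> float:
    # Assumption: a hand-written heuristic stands in for a learned ranker.
    q, s = question.lower(), sql.lower()
    overlap = len(set(q.split()) & set(s.split()))
    bonus = 1.0 if "how many" in q and "count" in s else 0.0
    return overlap + bonus


ranked = generate_then_rank(
    "How many singers are there", list(TOY_OUTPUTS), toy_generate, toy_score
)
best = ranked[0]
print(best.sql)
```

The point of the structure is that generation and ranking are decoupled: any translation model that accepts a conditioning signal could be passed as `generate`, and any scoring model could be passed as `score`, without changing the pipeline function.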
Taxonomy
Topics: Scientific Computing and Data Management · Mathematics, Computing, and Information Processing · Advanced Database Systems and Queries
Methods: Sparse Evolutionary Training
