Beyond SELECT: A Comprehensive Taxonomy-Guided Benchmark for Real-World Text-to-SQL Translation
Hao Wang, Yuanfeng Song, Xiaoming Yin, Xing Chen

TL;DR
This paper introduces a taxonomy-guided approach for building SQL-Synth, a diverse and comprehensive Text-to-SQL benchmark dataset, to better evaluate and improve LLM performance in real-world applications.
Contribution
It proposes a new taxonomy for Text-to-SQL tasks, uses it to synthesize a diverse dataset with LLMs, and demonstrates its effectiveness over existing datasets.
Findings
Widely used datasets such as Spider and Bird show limited coverage and diversity when measured against the taxonomy.
SQL-Synth is more diverse than previous benchmarks.
Fine-tuning improves LLM performance on complex scenarios.
Abstract
Text-to-SQL datasets are essential for training and evaluating text-to-SQL models, but existing datasets often suffer from limited coverage and fail to capture the diversity of real-world applications. To address this, we propose a novel taxonomy for text-to-SQL classification based on dimensions including core intents, statement types, syntax structures, and key actions. Using this taxonomy, we evaluate widely used public text-to-SQL datasets (e.g., Spider and Bird) and reveal limitations in their coverage and diversity. We then introduce a taxonomy-guided dataset synthesis pipeline, yielding a new dataset named SQL-Synth. This approach combines the taxonomy with Large Language Models (LLMs) to ensure the dataset reflects the breadth and complexity of real-world text-to-SQL applications. Extensive analysis and experimental results validate the effectiveness of our taxonomy, as…
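To make the idea of taxonomy-based coverage concrete, here is a minimal sketch of how one might tag queries along two of the dimensions named in the abstract (statement types and syntax structures) and measure how much of a label set a dataset covers. The label lists, regular expressions, and function names are illustrative assumptions, not the paper's actual taxonomy or pipeline, and keyword matching is only a rough stand-in for real SQL parsing.

```python
import re
from collections import Counter

# Hypothetical label set for the "statement types" dimension;
# the paper's actual categories are not listed in the abstract.
STATEMENT_TYPES = ("SELECT", "INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP")

# Hypothetical keyword patterns for the "syntax structures" dimension.
SYNTAX_PATTERNS = {
    "join": re.compile(r"\bJOIN\b", re.IGNORECASE),
    "group_by": re.compile(r"\bGROUP\s+BY\b", re.IGNORECASE),
    "order_by": re.compile(r"\bORDER\s+BY\b", re.IGNORECASE),
    "subquery": re.compile(r"\(\s*SELECT\b", re.IGNORECASE),
}


def statement_type(sql):
    """Return the leading keyword if it is a known statement type, else 'OTHER'."""
    tokens = sql.strip().split(None, 1)
    first = tokens[0].upper() if tokens else ""
    return first if first in STATEMENT_TYPES else "OTHER"


def syntax_structures(sql):
    """Return the set of (hypothetical) syntax-structure labels found in the query."""
    return {name for name, pattern in SYNTAX_PATTERNS.items() if pattern.search(sql)}


def coverage(queries):
    """Count statement types and report the fraction of known types that appear."""
    type_counts = Counter(statement_type(q) for q in queries)
    seen = set(type_counts) & set(STATEMENT_TYPES)
    return type_counts, len(seen) / len(STATEMENT_TYPES)


# Toy queries for illustration only; they are not drawn from any benchmark.
toy_queries = [
    "SELECT name FROM users WHERE id IN (SELECT user_id FROM orders)",
    "SELECT dept, COUNT(*) FROM emp GROUP BY dept",
    "UPDATE users SET active = 0 WHERE last_login < '2020-01-01'",
]

counts, ratio = coverage(toy_queries)
print(counts, ratio)
```

A dataset dominated by SELECT queries would score low on this kind of coverage ratio, which is the sort of gap a taxonomy-guided synthesis step could then target.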
Taxonomy
Topics: Natural Language Processing Techniques · Machine Learning and Data Classification · Topic Modeling
