SynSQL: Synthesizing Relational Databases for Robust Evaluation of Text-to-SQL Systems
Mohammadamin Habibollah, Davood Rafiei

TL;DR
SynSQL is a framework that synthesizes relational databases conditioned on natural language questions to evaluate and stress-test text-to-SQL systems beyond static benchmarks.
Contribution
SynSQL introduces a method for generating databases that are semantically aligned with the natural language question, which reveals model weaknesses and advances structured data synthesis techniques.
Findings
Evaluating on SynSQL-synthesized databases lowers the measured performance of text-to-SQL models by 3-14%.
SynSQL exposes errors masked by benchmark artifacts.
Analysis highlights strengths and limitations of LLMs in data synthesis.
Abstract
Evaluating text-to-SQL systems remains largely fragile: correctness is typically judged by executing predicted and gold SQL queries on a single static database, even though the same queries may behave differently under alternative database instances. This raises a broader language modeling question: Can large language models synthesize semantically meaningful, schema-consistent relational data directly from a natural language question? If so, such generation can serve as a controlled mechanism for stress-testing text-to-SQL systems beyond fixed benchmark databases. We introduce SynSQL, a framework that synthesizes test databases conditioned on question-schema alignment rather than gold SQL queries. SynSQL decomposes the task into three stages: (1) schema selection, (2) question-guided data synthesis, and (3) constraint-aware critique with iterative refinement, framing database…
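The fragility described in the abstract, where predicted and gold queries agree on one database instance but diverge on another, can be shown with a small sketch. This is only an illustration and not SynSQL's method: the `singer` table, both queries, and the two hand-written row sets are hypothetical, and the comparison uses Python's standard `sqlite3` module.

```python
import sqlite3

# Hypothetical schema and queries, chosen only to illustrate the problem.
SCHEMA = "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"

# Question: "What are the names of the oldest singers?"
GOLD = "SELECT name FROM singer WHERE age = (SELECT MAX(age) FROM singer)"
# A plausible but subtly wrong prediction: it returns a single row even on ties.
PREDICTED = "SELECT name FROM singer ORDER BY age DESC LIMIT 1"


def run(query, rows):
    """Execute `query` on a fresh in-memory database populated with `rows`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany(
            "INSERT INTO singer (id, name, age) VALUES (?, ?, ?)", rows
        )
        # Sort so the comparison ignores row order.
        return sorted(conn.execute(query).fetchall())
    finally:
        conn.close()


def execution_match(rows):
    """Return True if both queries produce the same result set on `rows`."""
    return run(GOLD, rows) == run(PREDICTED, rows)


# A static instance with no ties: the wrong query goes unnoticed.
static_rows = [(1, "Ana", 41), (2, "Ben", 35), (3, "Cy", 29)]
# An alternative instance with a tie on the maximum age exposes the error.
stress_rows = [(1, "Ana", 41), (2, "Ben", 41), (3, "Cy", 29)]

print(execution_match(static_rows))  # True
print(execution_match(stress_rows))  # False
```

On the first instance the prediction would be scored as correct; only the instance containing a tie separates the two queries, which is the kind of case a question-conditioned test database is meant to surface.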
