SynSQL: Synthesizing Relational Databases for Robust Evaluation of Text-to-SQL Systems

Mohammadamin Habibollah; Davood Rafiei

arXiv:2604.27261 · cs.DB · May 1, 2026

TL;DR

SynSQL is a framework that synthesizes relational databases conditioned on natural language questions to evaluate and stress-test text-to-SQL systems beyond static benchmarks.

Contribution

The paper introduces a method for generating semantically aligned databases from natural language questions. The generated databases reveal weaknesses in text-to-SQL models, and the method itself advances structured data synthesis.

Findings

01. Synthetic databases cause 3-14% performance drops in models.

02. SynSQL exposes errors masked by benchmark artifacts.

03. Analysis highlights strengths and limitations of LLMs in data synthesis.

Abstract

Evaluating text-to-SQL systems remains largely fragile: correctness is typically judged by executing predicted and gold SQL queries on a single static database, even though the same queries may behave differently under alternative database instances. This raises a broader language modeling question: Can large language models synthesize semantically meaningful, schema-consistent relational data directly from a natural language question? If so, such generation can serve as a controlled mechanism for stress-testing text-to-SQL systems beyond fixed benchmark databases. We introduce SynSQL, a framework that synthesizes test databases conditioned on question-schema alignment rather than gold SQL queries. SynSQL decomposes the task into three stages: (1) schema selection, (2) question-guided data synthesis, and (3) constraint-aware critique with iterative refinement, framing database…
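The fragility described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the `employee` table, the two queries, and the helper names `run` and `execution_match` are hypothetical, chosen only to show how a predicted query can match the gold query on one database instance and diverge on another.

```python
import sqlite3

# Hypothetical gold and predicted queries; they differ only at the boundary value.
GOLD = "SELECT name FROM employee WHERE salary > 50000"
PRED = "SELECT name FROM employee WHERE salary >= 50000"


def run(rows, query):
    """Run a query against a fresh in-memory database populated with rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)
    result = sorted(conn.execute(query).fetchall())
    conn.close()
    return result


def execution_match(rows):
    """Execution-based check: do gold and predicted queries return the same rows?"""
    return run(rows, GOLD) == run(rows, PRED)


# A single static instance with no row at the boundary: the bug stays hidden.
static_db = [("Ana", 40000), ("Bo", 60000)]

# An alternative instance with one boundary row: the two queries now disagree.
alternative_db = static_db + [("Cy", 50000)]

print(execution_match(static_db))       # True
print(execution_match(alternative_db))  # False
```

The predicted query is wrong in both cases, but only the second instance contains the row that separates the two queries. This is the kind of gap that evaluating on additional database instances is meant to surface.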
