TL;DR
ExeSQL introduces an execution-driven bootstrapping approach for self-taught text-to-SQL models, enabling effective adaptation to multiple SQL dialects through iterative, feedback-guided learning and execution-based filtering.
Contribution
The paper presents a novel framework that leverages execution feedback and agentic bootstrapping to improve multi-dialect text-to-SQL models without relying on high-quality dialect-specific datasets.
Findings
Achieves 15.2% improvement on PostgreSQL
Achieves 10.38% improvement on MySQL
Achieves 4.49% improvement on Oracle
Abstract
Recent text-to-SQL models have achieved strong performance, but their effectiveness remains largely confined to SQLite due to dataset limitations. Real-world applications, however, require SQL generation across multiple dialects with varying syntax and specialized features, which remains a challenge for current models. The main obstacle in building a dialect-aware model lies in acquiring high-quality dialect-specific data. Data generated purely through static prompting, without validating the SQL queries via execution, tends to be noisy and unreliable. Moreover, the lack of real execution environments in the training loop prevents models from grounding their predictions in executable semantics, limiting generalization despite surface-level improvements from data filtering. This work introduces ExeSQL, a text-to-SQL framework with execution-driven, agentic bootstrapping. The method consists of…
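The abstract's idea of validating generated SQL by execution can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `execution_filter`, the `(question, sql)` pair format, and the toy schema are all hypothetical, and Python's built-in `sqlite3` module stands in for the PostgreSQL, MySQL, or Oracle engines the paper targets. The sketch only checks that a candidate query runs; it does not check that the result is correct.

```python
import sqlite3


def execution_filter(candidates, schema_sql):
    """Split candidate queries into those that execute and those that fail.

    candidates: list of (question, sql) pairs (hypothetical format).
    schema_sql: DDL and seed statements used to build a scratch database.
    Returns (kept, rejected). Each rejected entry also carries the engine's
    error message, which a refinement step could feed back to the generator.
    """
    # Assumption: an in-memory SQLite database stands in for a real
    # dialect-specific engine.
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema_sql)
    kept, rejected = [], []
    for question, sql in candidates:
        try:
            conn.execute(sql).fetchall()  # run the query; discard the rows
            kept.append((question, sql))
        except sqlite3.Error as exc:
            rejected.append((question, sql, str(exc)))
    conn.close()
    return kept, rejected


# Toy schema and candidates, invented purely for illustration.
SCHEMA = """
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
INSERT INTO employees VALUES (1, 'Ada', 120.0), (2, 'Linus', 95.0);
"""

CANDIDATES = [
    ("Who earns more than 100?",
     "SELECT name FROM employees WHERE salary > 100"),
    ("Who earns more than 100?",
     "SELECT name FROM employee WHERE salary > 100"),  # wrong table name
    ("How many employees are there?",
     "SELECT COUNT(*) FORM employees"),  # misspelled keyword
]

kept, rejected = execution_filter(CANDIDATES, SCHEMA)
```

Here only the first candidate survives; the other two are rejected together with the database's error text. Swapping the SQLite connection for a driver that talks to the target dialect would be the natural next step, under the same assumptions.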
