Natural SQL: Making SQL Easier to Infer from Natural Language Specifications
Yujian Gan, Xinyun Chen, Jinxia Xie, Matthew Purver, John R. Woodward, John Drake, Qiaofu Zhang

TL;DR
This paper introduces Natural SQL (NatSQL), an intermediate representation that simplifies SQL queries to improve natural language to SQL translation, leading to better performance on complex benchmarks and enabling existing models to generate executable SQL.
Contribution
The paper proposes NatSQL, a simplified SQL IR that enhances text-to-SQL translation accuracy and compatibility with existing models, especially on complex queries.
Findings
NatSQL outperforms other IRs on the Spider benchmark.
NatSQL significantly improves state-of-the-art models' performance.
NatSQL enables models that previously could not generate executable SQL to do so, with higher accuracy.
Abstract
Addressing the mismatch between natural language descriptions and the corresponding SQL queries is a key challenge for text-to-SQL translation. To bridge this gap, we propose an SQL intermediate representation (IR) called Natural SQL (NatSQL). Specifically, NatSQL preserves the core functionalities of SQL, while it simplifies the queries as follows: (1) dispensing with operators and keywords such as GROUP BY, HAVING, FROM, JOIN ON, which are usually hard to find counterparts for in the text descriptions; (2) removing the need for nested subqueries and set operators; and (3) making schema linking easier by reducing the required number of schema items. On Spider, a challenging text-to-SQL benchmark that contains complex and nested SQL queries, we demonstrate that NatSQL outperforms other IRs, and significantly improves the performance of several previous SOTA models. Furthermore, for…
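To make point (1) concrete, here is a toy sketch of how a FROM/JOIN ON clause could be recovered when a simplified query mentions only table-qualified columns. This is not the authors' conversion procedure. The function name `infer_from_clause`, the foreign-key map, and the example schema (`singer`, `concert`) are all hypothetical and chosen only for illustration.

```python
def infer_from_clause(columns, foreign_keys):
    """Build a FROM/JOIN clause from table-qualified column names.

    columns: list of "table.column" strings used in the simplified query.
    foreign_keys: dict mapping (table_a, table_b) to a join condition string.
    Both arguments are illustrative assumptions, not part of NatSQL itself.
    """
    # Collect the tables mentioned by the columns, in first-seen order.
    tables = []
    for col in columns:
        table = col.split(".", 1)[0]
        if table not in tables:
            tables.append(table)

    clause = "FROM " + tables[0]
    joined = [tables[0]]
    for table in tables[1:]:
        # Look for a foreign-key condition linking the new table
        # to any table that has already been joined.
        condition = None
        for other in joined:
            condition = foreign_keys.get((other, table)) or foreign_keys.get((table, other))
            if condition:
                break
        if condition is None:
            raise ValueError(f"no join path to table {table!r}")
        clause += f" JOIN {table} ON {condition}"
        joined.append(table)
    return clause


# Hypothetical schema: the query mentions columns from two tables
# but never states FROM or JOIN ON explicitly.
columns = ["singer.name", "concert.year"]
foreign_keys = {("singer", "concert"): "singer.singer_id = concert.singer_id"}
from_clause = infer_from_clause(columns, foreign_keys)
print(from_clause)
```

The sketch only handles tables that are directly linked by a foreign key. A real converter would also need to handle intermediate bridge tables, self-joins, and the other simplifications the abstract lists.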
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
