Type- and Content-Driven Synthesis of SQL Queries from Natural Language
Navid Yaghmazadeh, Yuepeng Wang, Isil Dillig, Thomas Dillig

TL;DR
This paper introduces Sqlizer, an automated technique that combines natural language processing, program synthesis, and program repair to generate SQL queries from English descriptions, without requiring users to know the database schema.
Contribution
It proposes a synthesize-repair loop for generating SQL queries from natural language that works for any database without additional customization.
Findings
Desired query ranked among the top 5 candidates in close to 90% of cases
Effective fault localization and repair in query synthesis
Works across multiple database schemas
Abstract
This paper presents a new technique for automatically synthesizing SQL queries from natural language. Our technique is fully automated, works for any database without requiring additional customization, and does not require users to know the underlying database schema. Our method achieves these goals by combining natural language processing, program synthesis, and automated program repair. Given the user's English description, our technique first uses semantic parsing to generate a query sketch, which is subsequently completed using type-directed program synthesis and assigned a confidence score using database contents. However, since the user's description may not accurately reflect the actual database schema, our approach also performs fault localization and repairs the erroneous part of the sketch. This synthesize-repair loop is repeated until the algorithm infers a query with a…
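The synthesize-repair loop described in the abstract can be illustrated with a small Python sketch. This is a toy under stated assumptions, not Sqlizer's implementation: the database `DB`, the `SYNONYMS` repair table, the `Sketch` class, the string-similarity score, and the 0.8 threshold are all hypothetical stand-ins chosen for illustration. The sketch only shows the overall shape: complete a query sketch with type-correct candidates, score them against schema names and database contents, and repair the weakest part of the sketch when confidence is too low.

```python
import difflib
from dataclasses import dataclass, replace

# Hypothetical toy database: table name -> rows. Not taken from the paper.
DB = {
    "papers": [{"title": "Sqlizer", "year": 2017},
               {"title": "Older work", "year": 2015}],
    "authors": [{"name": "Ada", "city": "Austin"}],
}

# Hypothetical repair knowledge: alternative names to try for a faulty hint.
SYNONYMS = {"publications": ["papers"], "writers": ["authors"]}

@dataclass(frozen=True)
class Sketch:
    """Query shape SELECT ?column FROM ?table WHERE ?filter = value, with name hints."""
    table: str
    column: str
    filter: str
    value: object

def sim(a, b):
    """String similarity in [0, 1]; a stand-in for a semantic similarity measure."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

def complete(sketch, db):
    """Enumerate type-correct completions, scored by name similarity and contents."""
    results = []
    for table, rows in db.items():
        for col in rows[0]:
            for filt in rows[0]:
                # Type check: the filter column must hold values of the constant's type.
                if not all(type(r[filt]) is type(sketch.value) for r in rows):
                    continue
                score = (sim(sketch.table, table) * sim(sketch.column, col)
                         * sim(sketch.filter, filt))
                # Content check: penalize constants that never occur in the column.
                if all(r[filt] != sketch.value for r in rows):
                    score *= 0.1
                sql = f"SELECT {col} FROM {table} WHERE {filt} = {sketch.value!r}"
                results.append((score, sql))
    return sorted(results, reverse=True)

def localize(sketch, db):
    """Return the hint field whose best match against the schema is weakest."""
    tables = list(db)
    columns = [c for rows in db.values() for c in rows[0]]
    best = {
        "table": max(sim(sketch.table, t) for t in tables),
        "column": max(sim(sketch.column, c) for c in columns),
        "filter": max(sim(sketch.filter, c) for c in columns),
    }
    return min(best, key=best.get)

def synthesize(sketch, db, threshold=0.8, max_repairs=3):
    """Alternate completion and repair until a candidate is confident enough."""
    for _ in range(max_repairs + 1):
        ranked = complete(sketch, db)
        if ranked and ranked[0][0] >= threshold:
            return ranked[0][1]
        faulty = localize(sketch, db)
        alternatives = SYNONYMS.get(getattr(sketch, faulty), [])
        if not alternatives:
            return None  # no repair rule applies to the faulty hint
        sketch = replace(sketch, **{faulty: alternatives[0]})
    return None
```

For example, `synthesize(Sketch("publications", "title", "year", 2017), DB)` first fails to reach the threshold because no table is named `publications`, localizes the fault to the table hint, repairs it to `papers`, and then returns `SELECT title FROM papers WHERE year = 2017`. The repair step here is a simple synonym swap; the abstract does not specify the repair rules or the confidence function, so both are assumptions of this sketch.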
Taxonomy
Topics: Logic, programming, and type systems · Software Testing and Debugging Techniques · Software Engineering Research
