SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task
Tao Yu, Michihiro Yasunaga, Kai Yang, Rui Zhang, Dongxu Wang, Zifan Li, Dragomir Radev

TL;DR
SyntaxSQLNet introduces a syntax tree-based neural network for complex, cross-domain text-to-SQL tasks, significantly improving accuracy on the challenging Spider benchmark by leveraging syntax-aware decoding and data augmentation.
Contribution
The paper presents the first approach specifically designed for complex, cross-domain text-to-SQL generation, utilizing syntax tree decoding and cross-domain augmentation techniques.
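To make the idea of syntax tree decoding concrete, here is a minimal sketch, assuming a toy grammar and a stand-in chooser function. It is not the authors' architecture: GRAMMAR, decode, and first_choice are hypothetical names introduced for illustration, and in the paper the choice at each step would come from learned modules rather than a fixed rule. The sketch only shows the general technique of restricting each decoding step to the options the grammar allows, while passing the generation path history to whatever makes the choice.

```python
from typing import Callable, Dict, List

# Hypothetical toy grammar (an assumption, not the paper's grammar):
# for each decoding state, the states that may legally follow it.
GRAMMAR: Dict[str, List[str]] = {
    "ROOT": ["SELECT"],
    "SELECT": ["COLUMN"],
    "COLUMN": ["WHERE", "END"],
    "WHERE": ["CONDITION"],
    "CONDITION": ["VALUE", "ROOT"],  # choosing ROOT opens a nested subquery
    "VALUE": ["END"],
}

Chooser = Callable[[List[str], List[str]], str]


def decode(choose: Chooser, max_steps: int = 20) -> List[str]:
    """Generate a path through the grammar, one constrained step at a time."""
    history = ["ROOT"]  # generation path history, visible to the chooser
    while history[-1] != "END" and len(history) < max_steps:
        allowed = GRAMMAR[history[-1]]
        choice = choose(history, allowed)
        if choice not in allowed:
            raise ValueError(f"{choice!r} is not allowed after {history[-1]!r}")
        history.append(choice)
    return history


def first_choice(history: List[str], allowed: List[str]) -> str:
    """Stand-in for a learned model: always take the first legal option."""
    return allowed[0]


path = decode(first_choice)
```

Because every step is checked against the grammar, the decoder cannot emit a structurally invalid sequence, whatever the chooser prefers.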
Findings
Outperforms the previous state-of-the-art model by 7.3% in exact match accuracy on the Spider task.
Achieves an additional 7.5% improvement through cross-domain data augmentation.
Handles more complex SQL queries with multiple clauses and nested subqueries.
Abstract
Most existing studies in text-to-SQL tasks do not require generating complex SQL queries with multiple clauses or sub-queries, and generalizing to new, unseen databases. In this paper we propose SyntaxSQLNet, a syntax tree network to address the complex and cross-domain text-to-SQL generation task. SyntaxSQLNet employs a SQL specific syntax tree-based decoder with SQL generation path history and table-aware column attention encoders. We evaluate SyntaxSQLNet on the Spider text-to-SQL task, which contains databases with multiple tables and complex SQL queries with multiple SQL clauses and nested queries. We use a database split setting where databases in the test set are unseen during training. Experimental results show that SyntaxSQLNet can handle a significantly greater number of complex SQL examples than prior work, outperforming the previous state-of-the-art model by 7.3% in exact…
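The database split setting mentioned in the abstract can be illustrated with a short sketch, under stated assumptions. The function split_by_database, the test_fraction parameter, and the "db_id" field name are assumptions made for this example and do not describe the benchmark's actual split procedure. The point is only that examples are partitioned by database rather than by question, so no test database appears in training.

```python
import random
from typing import Dict, List, Tuple

Example = Dict[str, str]


def split_by_database(
    examples: List[Example], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Example], List[Example]]:
    """Split examples so that train and test share no database."""
    # Assumption: each example records its database under the key "db_id".
    db_ids = sorted({ex["db_id"] for ex in examples})
    rng = random.Random(seed)
    rng.shuffle(db_ids)

    # Hold out whole databases, at least one.
    n_test = max(1, round(len(db_ids) * test_fraction))
    test_dbs = set(db_ids[:n_test])

    train = [ex for ex in examples if ex["db_id"] not in test_dbs]
    test = [ex for ex in examples if ex["db_id"] in test_dbs]
    return train, test


# Made-up data: 5 databases with 2 questions each.
examples = [
    {"db_id": f"db{i}", "question": f"question {j} about db{i}"}
    for i in range(5)
    for j in range(2)
]
train, test = split_by_database(examples)
```

A question-level random split would instead let the model see every schema during training, which is a much easier setting than generalizing to unseen databases.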
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis
