ASTormer: An AST Structure-aware Transformer Decoder for Text-to-SQL
Ruisheng Cao, Hanchong Zhang, Hongshen Xu, Jieyu Li, Da Ma, Lu Chen, and Kai Yu

TL;DR
ASTormer is a Transformer-based decoder that generates well-formed SQL ASTs by incorporating structural priors. It is reported to outperform RNN-based decoders in both accuracy and efficiency on five text-to-SQL benchmarks.
Contribution
The paper presents a Transformer decoder for text-to-SQL that replaces the RNN cells of grammar-based decoders and injects AST structure priors (node types and positions in the tree) through absolute and relative position embeddings.
Findings
Outperforms baselines on five benchmarks
More efficient than RNN-based decoders
Effectively incorporates AST structure priors
Abstract
Text-to-SQL aims to generate an executable SQL program given the user utterance and the corresponding database schema. To ensure the well-formedness of output SQLs, one prominent approach adopts a grammar-based recurrent decoder to produce the equivalent SQL abstract syntax tree (AST). However, previous methods mainly utilize an RNN-series decoder, which 1) is time-consuming and inefficient and 2) introduces very few structure priors. In this work, we propose an AST structure-aware Transformer decoder (ASTormer) to replace traditional RNN cells. The structural knowledge, such as node types and positions in the tree, is seamlessly incorporated into the decoder via both absolute and relative position embeddings. Besides, the proposed framework is compatible with different traversing orders even considering adaptive node selection. Extensive experiments on five text-to-SQL benchmarks…
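To make the idea of "node types and positions in the tree" concrete, the sketch below builds a toy AST for a query like SELECT name FROM singer WHERE age > 30 and derives two kinds of structural features: a per-node (type, depth) pair that an absolute position embedding could index, and a pairwise relation that a relative position embedding could index. This is only an illustration under assumptions: the node type names, the `Node` class, and the choice of (steps up to the lowest common ancestor, steps down from it) as the relation are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity-based equality; avoids recursive comparisons
class Node:
    node_type: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, node_type: str) -> "Node":
        """Create a child of this node and return it."""
        child = Node(node_type, parent=self)
        self.children.append(child)
        return child


def depth(node: Node) -> int:
    """Number of edges between the node and the root."""
    d = 0
    while node.parent is not None:
        node = node.parent
        d += 1
    return d


def path_to_root(node: Node) -> List[Node]:
    """The node itself followed by each ancestor up to the root."""
    path = [node]
    while node.parent is not None:
        node = node.parent
        path.append(node)
    return path


def relative_position(a: Node, b: Node) -> Tuple[int, int]:
    """(steps up from a to the lowest common ancestor, steps down to b)."""
    index_in_b = {id(n): i for i, n in enumerate(path_to_root(b))}
    for up, n in enumerate(path_to_root(a)):
        if id(n) in index_in_b:
            return up, index_in_b[id(n)]
    raise ValueError("nodes are not in the same tree")


def dfs(root: Node) -> List[Node]:
    """Pre-order, left-to-right traversal (one possible decoding order)."""
    order = [root]
    for child in root.children:
        order.extend(dfs(child))
    return order


# Toy AST; the type names are hypothetical, not a real SQL grammar.
root = Node("sql")
select = root.add("select")
select_col = select.add("column")
from_clause = root.add("from")
table = from_clause.add("table")
where = root.add("condition")
where_col = where.add("column")
value = where.add("value")

nodes = dfs(root)
# Features an absolute embedding could index: node type and depth.
absolute_features = [(n.node_type, depth(n)) for n in nodes]
# Features a relative embedding could index: one relation per node pair.
relative_matrix = [[relative_position(a, b) for b in nodes] for a in nodes]
```

In a real decoder, each distinct node type, depth, and relation pair would map to a learned embedding vector. Here `dfs` fixes one traversal order, but the relation matrix depends only on the tree, so the same features would be available under a different node expansion order.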
Taxonomy
Topics: Natural Language Processing Techniques · Advanced Database Systems and Queries · Semantic Web and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Label Smoothing · Residual Connection · Byte Pair Encoding · Softmax · Dense Connections · Dropout
