TL;DR
This paper introduces DuoRAT, a simplified text-to-SQL model built from relation-aware transformers, and shows through ablations that some features, including structural SQL features and question-schema linking features, are redundant and can be dropped without sacrificing performance.
Contribution
DuoRAT is a re-implementation of the state-of-the-art RAT-SQL text-to-SQL model that uses only relation-aware or vanilla transformers as building blocks. Ablation studies with DuoRAT as the baseline identify which features are essential and which are redundant.
Findings
Simplified model maintains performance with fewer features
Structural SQL features and question-schema links are often redundant
Relation-aware transformers are effective building blocks
Abstract
Recent neural text-to-SQL models can effectively translate natural language questions to corresponding SQL queries on unseen databases. Working mostly on the Spider dataset, researchers have proposed increasingly sophisticated solutions to the problem. Contrary to this trend, in this paper we focus on simplifications. We begin by building DuoRAT, a re-implementation of the state-of-the-art RAT-SQL model that, unlike RAT-SQL, uses only relation-aware or vanilla transformers as the building blocks. We perform several ablation experiments using DuoRAT as the baseline model. Our experiments confirm the usefulness of some techniques and point out the redundancy of others, including structural SQL features and features that link the question with the schema.
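For readers unfamiliar with the building block named above, here is a minimal sketch of single-head relation-aware self-attention in its common formulation, where a learned embedding for the relation between items i and j is added to the key and to the value. It is an illustration only and not DuoRAT's actual code: the function name `relation_aware_attention`, its arguments, and the pure-Python list representation are all assumptions made for this example, and the query/key/value projections are assumed to have been applied already.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def relation_aware_attention(q, k, v, rel_ids, rel_k, rel_v):
    """Single-head relation-aware self-attention (illustrative sketch).

    q, k, v: lists of n vectors of length d, already projected.
    rel_ids[i][j]: integer id of the relation between item i and item j
        (hypothetical encoding, e.g. 0 = "no relation").
    rel_k, rel_v: one d-dimensional embedding per relation id, added to
        the keys and to the values respectively.
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        # Score of item j for query i: q_i . (k_j + r^K_ij) / sqrt(d)
        scores = []
        for j in range(n):
            rk = rel_k[rel_ids[i][j]]
            dot = sum(q[i][t] * (k[j][t] + rk[t]) for t in range(d))
            scores.append(dot / math.sqrt(d))
        weights = softmax(scores)
        # Output for item i: sum_j weight_ij * (v_j + r^V_ij)
        z = [0.0] * d
        for j in range(n):
            rv = rel_v[rel_ids[i][j]]
            for t in range(d):
                z[t] += weights[j] * (v[j][t] + rv[t])
        out.append(z)
    return out
```

When every relation embedding is the zero vector, this reduces to vanilla scaled dot-product attention, which is one way to see how the relation-aware and vanilla transformer blocks relate.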
