Content Enhanced BERT-based Text-to-SQL Generation
Tong Guo, Huilin Gao

TL;DR
This paper introduces a simple method to incorporate table content into BERT-based models for text-to-SQL tasks, improving accuracy by leveraging matching words between questions and table data.
Contribution
The authors propose a novel approach that encodes additional features based on content matching, enhancing BERT-based models for text-to-SQL without complex modifications.
Findings
Outperforms the BERT-based baseline by 3.7% in both logic form accuracy and execution accuracy
Achieves state-of-the-art results on WikiSQL dataset
Method also benefits inference at test time because the tables are almost the same in training and testing
Abstract
We present a simple method to leverage the table content for a BERT-based model to solve the text-to-SQL problem. Based on the observation that some of the table content matches some words in the question string, and some of the table headers also match some words in the question string, we encode two additional feature vectors for the deep model. Our method also benefits model inference at test time, as the tables are almost the same at training and test time. We test our model on the WikiSQL dataset and outperform the BERT-based baseline by 3.7% in logic form accuracy and 3.7% in execution accuracy, achieving state-of-the-art results.
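The two feature vectors described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace-style tokenizer, and the exact-token matching rule are all assumptions made for illustration, and the paper may use a different matching strategy. The first vector marks which question tokens occur in some table cell; the second marks which headers share a token with the question.

```python
import re


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def question_match_vector(question, table_rows):
    """Hypothetical helper: 1 for each question token found in any table cell, else 0."""
    cell_tokens = {
        tok
        for row in table_rows
        for cell in row
        for tok in tokenize(str(cell))
    }
    return [1 if tok in cell_tokens else 0 for tok in tokenize(question)]


def header_match_vector(question, headers):
    """Hypothetical helper: 1 for each header sharing a token with the question, else 0."""
    question_tokens = set(tokenize(question))
    return [1 if question_tokens & set(tokenize(h)) else 0 for h in headers]


# Toy table and question, invented for this example only.
headers = ["Player", "Country", "Score"]
rows = [
    ["Tiger Woods", "United States", 68],
    ["Ernie Els", "South Africa", 70],
]
question = "What is the score of Tiger Woods?"

q_vec = question_match_vector(question, rows)   # one entry per question token
h_vec = header_match_vector(question, headers)  # one entry per table header
print(q_vec)
print(h_vec)
```

In a BERT-based model, vectors like these would typically be embedded and combined with the token representations; how the paper does that is not stated in this summary, so the sketch stops at producing the raw 0/1 marks.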
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Scientific Computing and Data Management
Methods: Test
