Content Enhanced BERT-based Text-to-SQL Generation

Tong Guo; Huilin Gao

arXiv:1910.07179 · cs.CL · April 23, 2020 · 46 cites

TL;DR

This paper introduces a simple method to incorporate table content into BERT-based models for text-to-SQL tasks, improving accuracy by leveraging matching words between questions and table data.

Contribution

The authors propose a novel approach that encodes additional features based on content matching, enhancing BERT-based models for text-to-SQL without complex modifications.

Findings

01. Outperforms BERT baseline by 3.7% in logic form accuracy
02. Achieves state-of-the-art results on WikiSQL dataset
03. Method benefits inference due to consistent table data between training and testing

Abstract

We present a simple method to leverage table content in a BERT-based model for the text-to-SQL problem. Based on the observation that some table content matches words in the question string, and that some table headers also match words in the question string, we encode two additional feature vectors for the deep model. Our method also benefits model inference at test time, as the tables are almost the same at training and test time. We test our model on the WikiSQL dataset and outperform the BERT-based baseline by 3.7% in logic form accuracy and 3.7% in execution accuracy, achieving state-of-the-art results.
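The two matching-based feature vectors described in the abstract can be illustrated with a minimal sketch. This is not the authors' exact algorithm: the function names, the binary 0/1 flags, and the whitespace tokenizer (standing in for BERT's WordPiece tokenizer) are all assumptions made for illustration. One vector marks question tokens that occur in table cells; the other marks headers that share a token with the question.

```python
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (hypothetical stand-in for WordPiece)."""
    return text.lower().replace("?", " ").split()


def question_mark_vector(question: str, cells: Sequence[str]) -> List[int]:
    """Flag each question token with 1 if it occurs in any table cell."""
    cell_tokens = {tok for cell in cells for tok in tokenize(str(cell))}
    return [1 if tok in cell_tokens else 0 for tok in tokenize(question)]


def header_mark_vector(question: str, headers: Sequence[str]) -> List[int]:
    """Flag each header with 1 if any of its tokens occurs in the question."""
    question_tokens = set(tokenize(question))
    return [1 if question_tokens & set(tokenize(h)) else 0 for h in headers]


# Toy table, invented for this example only.
question = "What position does the player from Duke play?"
headers = ["Player", "Position", "School"]
cells = ["Kyrie Irving", "Guard", "Duke", "Tim Duncan", "Center", "Wake Forest"]

print(question_mark_vector(question, cells))   # one flag per question token
print(header_mark_vector(question, headers))   # one flag per table header
```

In a model along these lines, such flags would typically be embedded and combined with the BERT token representations before SQL decoding; how exactly the paper combines them is not stated in the abstract.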

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Scientific Computing and Data Management

Methods: Test