Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

Victor Zhong, Caiming Xiong, and Richard Socher

arXiv:1709.00103 · cs.CL · November 13, 2017 · 786 cites

TL;DR

Seq2SQL is a neural network model that translates natural language questions into SQL queries, leveraging query structure and reinforcement learning to improve accuracy, supported by a large annotated dataset called WikiSQL.

Contribution

Introduces Seq2SQL, a deep neural network trained with reinforcement learning to translate natural language questions into SQL queries, and WikiSQL, a new large dataset for the task.

Findings

01 Improves execution accuracy on WikiSQL from 35.9% to 59.4%.

02 Improves logical form accuracy on WikiSQL from 23.4% to 48.3%.

03 Outperforms attentional sequence-to-sequence baselines on WikiSQL.
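
The two metrics above measure different things: execution accuracy checks whether a predicted query returns the same result as the reference query, while logical form accuracy checks whether the queries themselves match. The following is a minimal sketch of that distinction, not the paper's evaluation code. The `players` table, the helper names `run`, `normalize`, and `score`, and the whitespace-and-case normalization used for the logical form comparison are all assumptions made for illustration.

```python
import sqlite3

def run(conn, sql):
    """Execute a query and return its rows sorted, or None if the SQL is invalid."""
    try:
        return sorted(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def normalize(sql):
    """Lowercase and collapse whitespace (an assumed, simplistic canonical form)."""
    return " ".join(sql.lower().split())

def score(conn, pairs):
    """Return (execution accuracy, logical form accuracy) over (predicted, gold) pairs."""
    exec_hits = 0
    lf_hits = 0
    for pred, gold in pairs:
        pred_rows = run(conn, pred)
        if pred_rows is not None and pred_rows == run(conn, gold):
            exec_hits += 1
        if normalize(pred) == normalize(gold):
            lf_hits += 1
    n = len(pairs)
    return exec_hits / n, lf_hits / n

# Hypothetical toy table, not taken from WikiSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "Red", 30), ("Bob", "Red", 12), ("Cy", "Blue", 30)],
)

pairs = [
    # Identical queries: counted by both metrics.
    ("SELECT name FROM players WHERE team = 'Red'",
     "SELECT name FROM players WHERE team = 'Red'"),
    # Same result, conditions in a different order: execution match only.
    ("SELECT name FROM players WHERE points = 30 AND team = 'Red'",
     "SELECT name FROM players WHERE team = 'Red' AND points = 30"),
    # Wrong column selected: counted by neither metric.
    ("SELECT team FROM players WHERE points = 12",
     "SELECT name FROM players WHERE points = 12"),
]

exec_acc, lf_acc = score(conn, pairs)
print(f"execution accuracy: {exec_acc:.2f}, logical form accuracy: {lf_acc:.2f}")
```

In this toy setup, the second pair shows why execution accuracy can exceed logical form accuracy: reordered conditions change the query text without changing what it returns.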

Abstract

A significant amount of the world's knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query languages such as SQL. We propose Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries. Our model leverages the structure of SQL queries to significantly reduce the output space of generated queries. Moreover, we use rewards from in-the-loop query execution over the database to learn a policy to generate unordered parts of the query, which we show are less suitable for optimization via cross entropy loss. In addition, we will publish WikiSQL, a dataset of 80654 hand-annotated examples of questions and SQL queries distributed across 24241 tables from Wikipedia. This dataset is required to train our model and is an order of magnitude larger than…
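
The abstract's point about unordered parts of a query can be made concrete with a small sketch. Conditions joined by `AND` can be written in any order without changing the result, so a token-by-token training target penalizes a reordered but equivalent query, whereas a reward computed by executing the query does not. The code below only illustrates that contrast under stated assumptions: the `cities` table, the function names, and the reward constants (`1.0`, `-1.0`, `-2.0`) are illustrative choices, and no policy gradient update is shown.

```python
import sqlite3

def token_mismatches(predicted_sql, gold_sql):
    """Count positions where whitespace-separated tokens differ.

    A crude stand-in for what a token-level loss penalizes.
    """
    return sum(p != g for p, g in zip(predicted_sql.split(), gold_sql.split()))

def execution_reward(conn, predicted_sql, gold_sql):
    """Score a predicted query by running it against the database.

    The constants are illustrative: invalid SQL scores lowest, a valid query
    with the wrong result scores higher, and a correct result scores highest.
    """
    gold_rows = sorted(conn.execute(gold_sql).fetchall())
    try:
        pred_rows = sorted(conn.execute(predicted_sql).fetchall())
    except sqlite3.Error:
        return -2.0
    return 1.0 if pred_rows == gold_rows else -1.0

# Hypothetical toy table, not taken from WikiSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (city TEXT, country TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [("Paris", "France", 2100000),
     ("Lyon", "France", 500000),
     ("Berlin", "Germany", 3600000)],
)

gold = "SELECT city FROM cities WHERE country = 'France' AND population > 1000000"
reordered = "SELECT city FROM cities WHERE population > 1000000 AND country = 'France'"
wrong = "SELECT city FROM cities WHERE country = 'France'"
invalid = "SELECT city FROM WHERE"

mismatches = token_mismatches(reordered, gold)
reward_reordered = execution_reward(conn, reordered, gold)
reward_wrong = execution_reward(conn, wrong, gold)
reward_invalid = execution_reward(conn, invalid, gold)
```

Here `reordered` differs from `gold` at several token positions yet receives the same reward as an exact match, which is the property that makes an execution-based signal a better fit for order-insensitive clauses.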

Taxonomy

Topics: Data Mining Algorithms and Applications · Data Stream Mining Techniques · Data Management and Algorithms