Knowledge Base Construction for Knowledge-Augmented Text-to-SQL
Jinheon Baek, Horst Samulowitz, Oktie Hassanzadeh, Dharmashankar Subramanian, Sola Shirai, Alfio Gliozzo, and Debarun Bhattacharjya

TL;DR
This paper introduces a method for constructing a comprehensive knowledge base for text-to-SQL. Grounding SQL generation in this reusable, domain-agnostic knowledge source improves accuracy over the baselines the authors compare against.
Contribution
The paper proposes a knowledge base construction approach that draws on all available questions and schemas, with the aim of improving generalization and accuracy in text-to-SQL translation.
Findings
Outperforms relevant baselines on multiple datasets
Effective in both overlapping and non-overlapping database scenarios
Enhances SQL accuracy by grounding in a knowledge base
Abstract
Text-to-SQL aims to translate natural language queries into SQL statements, which is practical because it lets anyone retrieve the information they need from databases. Many recent approaches tackle this problem with Large Language Models (LLMs), leveraging their strong capability in understanding user queries and generating the corresponding SQL code. Yet the parametric knowledge in LLMs may not cover all the diverse, domain-specific queries that require grounding in various database schemas, which often makes the generated SQL less accurate. To tackle this, we propose constructing a knowledge base for text-to-SQL, a foundational source of knowledge from which we retrieve and generate the necessary knowledge for given queries. In particular, unlike existing approaches that either manually annotate knowledge or generate only a few pieces of…
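To make the retrieve-then-generate idea in the abstract concrete, here is a minimal Python sketch of grounding a question in a knowledge base before asking a model for SQL. It is an illustration under stated assumptions, not the authors' method: the `KnowledgeEntry` structure, the token-overlap (Jaccard) retriever, the prompt layout, and the sample schema and entries are all hypothetical, and the summary above does not say how the paper actually retrieves or formats knowledge. No LLM is called; the sketch stops at the prompt.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeEntry:
    """One reusable piece of knowledge (hypothetical structure)."""
    text: str


def tokenize(s: str) -> set[str]:
    """Lowercase the string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", s.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-overlap similarity; a stand-in for whatever retriever is really used."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(question: str, kb: list[KnowledgeEntry], k: int = 2) -> list[KnowledgeEntry]:
    """Return up to k entries that share at least one token with the question."""
    q_tokens = tokenize(question)
    scored = [(jaccard(q_tokens, tokenize(e.text)), e) for e in kb]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for score, entry in scored[:k] if score > 0]


def build_prompt(question: str, schema: str, knowledge: list[KnowledgeEntry]) -> str:
    """Assemble a text-to-SQL prompt that includes the retrieved knowledge."""
    hints = "\n".join(f"- {e.text}" for e in knowledge) or "- (none retrieved)"
    return (
        f"Schema:\n{schema}\n\n"
        f"Relevant knowledge:\n{hints}\n\n"
        f"Question: {question}\nSQL:"
    )


# Hypothetical knowledge base and schema, for illustration only.
KB = [
    KnowledgeEntry("active customers are rows where customers.status = 'A'"),
    KnowledgeEntry("revenue is orders.quantity * orders.unit_price"),
    KnowledgeEntry("fiscal year starts in April"),
]
SCHEMA = "customers(id, name, status)\norders(id, customer_id, quantity, unit_price)"

if __name__ == "__main__":
    question = "How many active customers are there?"
    print(build_prompt(question, SCHEMA, retrieve(question, KB)))
```

The prompt produced this way would then be passed to an LLM. A realistic system would likely replace the token-overlap scorer with a stronger retriever, but the control flow (look up knowledge, then condition generation on it) stays the same.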
Taxonomy
Topics: Natural Language Processing Techniques · Advanced Database Systems and Queries · Topic Modeling
