TCSR-SQL: Towards Table Content-aware Text-to-SQL with Self-retrieval
Wenbo Xu, Liang Yan, Chuanyi Liu, Peiyi Han, Haifeng Zhu, Yong Xu, Yingwei Liang, Bob Zhang

TL;DR
TCSR-SQL is a self-retrieval approach that uses LLMs to improve table content-aware Text-to-SQL translation. It extracts data content keywords from the question, infers the related database schema, and iteratively refines the SQL query, and it is reported to significantly outperform existing methods.
Contribution
The paper proposes a novel self-retrieval framework that improves table content-aware Text-to-SQL generation through multi-round knowledge encoding and database search, and it introduces a new benchmark dataset.
Findings
Achieves an improvement of at least 27.8% in execution accuracy over state-of-the-art methods.
Effectively extracts data content keywords and infers schemas for better SQL generation.
Demonstrates robustness on a new question-related, table-content-aware benchmark dataset.
Abstract
Large Language Model-based (LLM-based) Text-to-SQL methods have achieved important progress in generating SQL queries for real-world applications. When confronted with table content-aware questions in real-world scenarios, ambiguous data content keywords and nonexistent database schema column names within the question lead to poor performance of existing methods. To solve this problem, we propose a novel approach towards Table Content-aware Text-to-SQL with Self-Retrieval (TCSR-SQL). It leverages the LLM's in-context learning capability to extract data content keywords within the question and infer the possibly related database schema, which is used to generate Seed SQL to fuzz search databases. The search results are further used to confirm the encoding knowledge with the designed encoding knowledge table, including column names and exact stored content values used in the SQL. The encoding…
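The Seed SQL step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the `branch` table, its columns, the keyword, and the helpers `seed_sql` and `fuzzy_search` are all hypothetical, the candidate (table, column) pairs stand in for what an LLM would infer from the question, and a SQLite `LIKE` pattern is assumed as the fuzzy ("fuzz") search.

```python
import sqlite3


def seed_sql(table, column):
    # Hypothetical Seed SQL template: look for stored values in one
    # candidate column that contain the keyword (bound as a parameter).
    return f'SELECT DISTINCT "{column}" FROM "{table}" WHERE "{column}" LIKE ?'


def fuzzy_search(conn, candidates, keyword):
    # Run the seed query against every candidate (table, column) pair and
    # collect the exact stored values that matched the keyword.
    hits = []
    for table, column in candidates:
        rows = conn.execute(seed_sql(table, column), (f"%{keyword}%",)).fetchall()
        hits.extend((table, column, value) for (value,) in rows)
    return hits


# Toy in-memory database standing in for the real one (assumed data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_name TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO branch VALUES (?, ?)",
    [("Shenzhen Nanshan Branch", "South"), ("Beijing Haidian Branch", "North")],
)

# Assumed output of the keyword-extraction and schema-inference step.
candidates = [("branch", "branch_name"), ("branch", "region")]
hits = fuzzy_search(conn, candidates, "Nanshan")
print(hits)
```

Each hit pairs a column name with the exact stored value, which is the kind of information the abstract says is confirmed against the encoding knowledge table before the final SQL is written.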
Taxonomy
Topics: Advanced Database Systems and Queries · Semantic Web and Ontologies · Advanced Computational Techniques and Applications
