End-to-End Text-to-SQL with Dataset Selection: Leveraging LLMs for Adaptive Query Generation
Anurag Tripathi, Vaibhav Patle, Abhinav Jain, Ayush Pundir, Sairam Menon, Ajeet Kumar Singh, Dorien Herremans

TL;DR
This paper introduces an end-to-end framework that uses large language models and prompt engineering to identify the correct database before generating SQL queries, improving accuracy in multi-database scenarios.
Contribution
It presents a novel three-stage approach combining LLMs, a database prediction model, and error correction to enhance text-to-SQL performance in complex database environments.
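The overall control flow of such a three-stage approach (pick the database, generate SQL, then correct errors) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in, not the paper's method: the `SCHEMAS` dictionary, the token-overlap heuristic in `predict_database`, the canned output of `generate_sql` (where a real system would prompt an LLM), and the table-name repair in `correct_sql` are all assumptions made for illustration.

```python
import difflib
import re
import sqlite3

# Hypothetical toy schemas; a real deployment would load these from its databases.
SCHEMAS = {
    "music": "CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, country TEXT);",
    "flights": "CREATE TABLE airport (code TEXT PRIMARY KEY, city TEXT, country TEXT);",
}


def tokens(text):
    """Lowercase alphabetic tokens of a string, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def predict_database(nlq, schemas):
    """Stage 1 stand-in: choose the schema sharing the most tokens with the question.
    (Assumption: the paper uses a learned prediction model, not this heuristic.)"""
    question = tokens(nlq)
    return max(schemas, key=lambda db: len(question & tokens(schemas[db])))


def generate_sql(nlq, db_id):
    """Stage 2 stand-in: canned output where a real system would prompt an LLM.
    The table name for "music" is deliberately wrong so stage 3 has work to do."""
    canned = {
        "music": "SELECT name, country FROM singers",
        "flights": "SELECT city FROM airport",
    }
    return canned[db_id]


def correct_sql(sql, ddl):
    """Stage 3 stand-in: compile the query against an empty in-memory copy of the
    schema and repair an unknown table name with its closest real match."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(ddl)
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        try:
            conn.execute("EXPLAIN " + sql)  # compiles the query without needing data
            return sql
        except sqlite3.OperationalError as err:
            missing = re.match(r"no such table: (\w+)", str(err))
            if missing:
                close = difflib.get_close_matches(missing.group(1), tables, n=1)
                if close:
                    pattern = rf"\b{re.escape(missing.group(1))}\b"
                    return re.sub(pattern, close[0], sql)
            raise
    finally:
        conn.close()


def run_pipeline(nlq):
    """Chain the three stages and return (database id, corrected SQL)."""
    db_id = predict_database(nlq, SCHEMAS)
    draft = generate_sql(nlq, db_id)
    return db_id, correct_sql(draft, SCHEMAS[db_id])
```

Calling `run_pipeline("List the name and country of every singer")` selects the `music` schema and rewrites the draft's `singers` table to `singer`. The point of the sketch is the separation of concerns: the database is chosen before any SQL is written, and the correction step only needs the chosen schema, not the data.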
Findings
Outperforms state-of-the-art methods in database intent prediction
Achieves higher SQL generation accuracy
Effective in multi-database query scenarios
Abstract
Text-to-SQL bridges the gap between natural language and structured database language, thus allowing non-technical users to easily query databases. Traditional approaches model text-to-SQL as a direct translation task, where a given Natural Language Query (NLQ) is mapped to an SQL command. Recent advances in large language models (LLMs) have significantly improved translation accuracy; however, these methods all require that the target database be pre-specified. This becomes problematic in scenarios with multiple extensive databases, where identifying the correct database becomes a crucial yet overlooked step. In this paper, we propose a three-stage end-to-end text-to-SQL framework to identify the user's intended database before generating SQL queries. Our approach leverages LLMs and prompt engineering to extract implicit information from NLQs in the form of a…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
