From Natural Language to SQL: Review of LLM-based Text-to-SQL Systems
Ali Mohammadjafari, Anthony S. Maida, Raju Gottumukkala

TL;DR
This paper reviews the evolution of LLM-based text-to-SQL systems, emphasizing recent advancements with Retrieval Augmented Generation, benchmarking, and challenges like efficiency and privacy.
Contribution
It provides a comprehensive survey of the development of LLM-based text-to-SQL systems, including novel insights into Graph RAGs and evaluation methods.
Findings
LLMs with RAG significantly improve SQL query accuracy.
Graph RAGs enhance contextual understanding and schema linking.
Key challenges include computational efficiency and data privacy.
Abstract
Large language models (LLMs), when used with Retrieval Augmented Generation (RAG), are greatly improving the state of the art in translating natural language queries into structured and correct SQL. Unlike previous reviews, this survey provides a comprehensive study of the evolution of LLM-based text-to-SQL systems, from early rule-based models to advanced LLM approaches that use RAG. We discuss benchmarks, evaluation methods, and evaluation metrics. We also uniquely study the use of Graph RAGs for better contextual accuracy and schema linking in these systems. Finally, we highlight key challenges, such as computational efficiency, model robustness, and data privacy, that must be addressed to improve LLM-based text-to-SQL systems.
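To make the retrieval idea in the abstract concrete, here is a minimal sketch of how a RAG-style text-to-SQL pipeline might select relevant schema context before prompting a model. It is an illustration under stated assumptions, not the method of any system covered by the survey: the toy `SCHEMA`, the token-overlap scoring, and the helper names `tokenize`, `retrieve_tables`, and `build_prompt` are all hypothetical, and real systems typically use embedding-based retrieval and an actual LLM call, both omitted here.

```python
import re

# Hypothetical toy schema (an assumption for illustration): table name -> DDL.
SCHEMA = {
    "customers": "CREATE TABLE customers (id INT, name TEXT, city TEXT)",
    "orders": "CREATE TABLE orders (id INT, customer_id INT, total REAL)",
    "products": "CREATE TABLE products (id INT, title TEXT, price REAL)",
}


def tokenize(text):
    """Lowercase and split on non-alphanumerics, so 'customer_id' yields
    the tokens 'customer' and 'id'."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_tables(question, schema, k=2):
    """Rank tables by token overlap between the question and each table's DDL.

    Token overlap is a stand-in for the embedding similarity a real
    retriever would use; ties keep the schema's insertion order.
    """
    q_tokens = tokenize(question)
    ranked = sorted(
        schema,
        key=lambda table: len(q_tokens & tokenize(schema[table])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, schema, k=2):
    """Assemble a prompt containing only the retrieved schema context."""
    tables = retrieve_tables(question, schema, k)
    context = "\n".join(schema[t] for t in tables)
    return (
        "Given the following tables, write a SQL query.\n"
        f"{context}\n"
        f"Question: {question}\n"
        "SQL:"
    )


question = "What is the total of orders for each customer name?"
print(build_prompt(question, SCHEMA))
```

Restricting the prompt to the top-ranked tables keeps it short and removes irrelevant schema elements, which is the intuition behind retrieval-based schema linking. A graph-based variant would additionally follow foreign-key edges (for example, from `orders.customer_id` to `customers.id`) to pull in tables needed for joins.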
Taxonomy
Topics: Advanced Computational Techniques and Applications
