From Natural Language to SQL: Review of LLM-based Text-to-SQL Systems

Ali Mohammadjafari; Anthony S. Maida; Raju Gottumukkala

arXiv:2410.01066·cs.CL·February 5, 2025·3 cites

Open Access

TL;DR

This paper reviews the evolution of LLM-based text-to-SQL systems, emphasizing recent advancements with Retrieval Augmented Generation, benchmarking, and challenges like efficiency and privacy.

Contribution

It provides a comprehensive survey of the development of LLM-based text-to-SQL systems, including novel insights into Graph RAGs and evaluation methods.

Findings

01. LLMs with RAG significantly improve SQL query accuracy.

02. Graph RAGs enhance contextual understanding and schema linking.

03. Key challenges include computational efficiency and data privacy.

Abstract

LLMs, when used with Retrieval Augmented Generation (RAG), are greatly improving the state of the art (SOTA) in translating natural language queries into structured and correct SQL. Unlike previous reviews, this survey provides a comprehensive study of the evolution of LLM-based text-to-SQL systems, from early rule-based models to advanced LLM approaches that use RAG systems. We discuss benchmarks, evaluation methods, and evaluation metrics. We also study the use of Graph RAGs for better contextual accuracy and schema linking in these systems. Finally, we highlight key challenges that must be addressed to improve LLM-based text-to-SQL systems, such as computational efficiency, model robustness, and data privacy.

Taxonomy

Topics: Advanced Computational Techniques and Applications