CSR-RAG: An Efficient Retrieval System for Text-to-SQL on the Enterprise Scale
Rajpreet Singh, Novak Boškov, Lawrence Drabeck, Aditya Gudal, Manzoor A. Khan

TL;DR
CSR-RAG is a hybrid retrieval system for enterprise-scale Text-to-SQL. It retrieves the relevant tables with up to 40% precision and over 80% recall at an average latency of about 30 ms, which makes it suitable for large databases and LLM integration.
Contribution
The paper introduces CSR-RAG, a novel hybrid retrieval approach that combines contextual, structural, and relational retrieval to select tables efficiently and with sufficient accuracy before SQL generation at enterprise scale.
Findings
Achieves up to 40% precision and over 80% recall on enterprise benchmarks.
Maintains a low average query generation latency of 30 ms on commodity data center hardware.
Suitable for integration with modern LLM-based enterprise systems.
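To make the precision and recall figures above concrete, here is a minimal sketch of how set-based precision and recall can be computed for a single table-retrieval query. This summary does not state how the paper defines or averages its metrics, so the set-based definition, the function name `retrieval_precision_recall`, and the table names are all assumptions used only for illustration.

```python
def retrieval_precision_recall(retrieved, relevant):
    """Set-based precision and recall for one table-retrieval query.

    Hypothetical helper: the paper's exact metric definition is not
    given in this summary, so the standard set-based one is assumed.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # tables that were both retrieved and needed

    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Made-up example: the retriever returns five tables, two of which
# the gold SQL query actually needs.
retrieved_tables = ["orders", "customers", "invoices", "shipments", "regions"]
gold_tables = ["orders", "customers"]

precision, recall = retrieval_precision_recall(retrieved_tables, gold_tables)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case the retriever over-fetches: every needed table is present (recall 1.0), but only two of the five returned tables are needed (precision 0.4). A retriever tuned this way favors recall, since a table missing from the context cannot be used in the generated SQL, while extra tables can still be ignored by the downstream LLM.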
Abstract
Natural language to SQL translation (Text-to-SQL) is one of the long-standing problems that has recently benefited from advances in Large Language Models (LLMs). While most academic Text-to-SQL benchmarks request schema description as a part of natural language input, enterprise-scale applications often require table retrieval before SQL query generation. To address this need, we propose a novel hybrid Retrieval Augmented Generation (RAG) system consisting of contextual, structural, and relational retrieval (CSR-RAG) to achieve computationally efficient yet sufficiently accurate retrieval for enterprise-scale databases. Through extensive enterprise benchmarks, we demonstrate that CSR-RAG achieves up to 40% precision and over 80% recall while incurring a negligible average query generation latency of only 30ms on commodity data center hardware, which makes it appropriate for modern…
Taxonomy
Topics: Web Data Mining and Analysis · Advanced Database Systems and Queries · Natural Language Processing Techniques
