RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL
Zhenhe Wu, Zhongqiu Li, Jie Zhang, Mengxiang Li, Yu Zhao, Ruiyu Fang, Zhongjiang He, Xuelong Li, Zhoujun Li, Shuangyong Song

TL;DR
RB-SQL is a retrieval-based framework that improves text-to-SQL performance by selecting relevant schema elements and examples for in-context learning, outperforming several competitive baselines on the public BIRD and Spider datasets.
Contribution
It introduces a novel retrieval-based approach for prompt engineering in text-to-SQL tasks, addressing scalability and efficiency issues of prior methods.
Findings
Achieves superior performance on BIRD and Spider datasets.
Effectively retrieves relevant schema and examples for improved reasoning.
Outperforms several competitive baselines.
Abstract
Large language models (LLMs) with in-context learning have significantly improved performance on the text-to-SQL task. Previous works generally focus on using exclusive SQL generation prompts to improve the LLMs' reasoning ability. However, they mostly struggle to handle large databases with numerous tables and columns, and usually ignore the significance of pre-processing the database and extracting valuable information for more efficient prompt engineering. Based on the above analysis, we propose RB-SQL, a novel retrieval-based LLM framework for in-context prompt engineering, which consists of three modules that retrieve concise tables and columns as schema, and targeted examples for in-context learning. Experimental results demonstrate that our model achieves better performance than several competitive baselines on the public datasets BIRD and Spider.
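The abstract does not spell out how the three retrieval modules work, so the following is only a minimal sketch of the general idea: pick a few relevant tables and one similar example, then assemble a compact prompt. It uses plain token overlap (Jaccard similarity) as a stand-in scorer, which is an assumption and not the paper's retriever. All names here (`tokens`, `overlap`, `top_k`, `build_prompt`, `SCHEMA`, `EXAMPLES`) and the toy data are hypothetical.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens; underscores in identifiers act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def top_k(query, items, key, k):
    """Return the k items whose key text overlaps most with the query."""
    return sorted(items, key=lambda it: overlap(query, key(it)), reverse=True)[:k]


def build_prompt(question, schema, examples, k_tables=2, k_examples=1):
    """Assemble a prompt from retrieved tables and retrieved examples (illustrative only)."""
    # Score each table by its name plus its column names.
    tables = top_k(question, list(schema.items()),
                   key=lambda kv: kv[0] + " " + " ".join(kv[1]), k=k_tables)
    # Score each example by its natural-language question.
    shots = top_k(question, examples, key=lambda ex: ex["question"], k=k_examples)

    lines = ["-- Schema"]
    for name, cols in tables:
        lines.append(f"{name}({', '.join(cols)})")
    lines.append("-- Examples")
    for ex in shots:
        lines.append(f"Q: {ex['question']}\nSQL: {ex['sql']}")
    lines.append(f"Q: {question}\nSQL:")
    return "\n".join(lines)


# Toy database schema and example pool (hypothetical data).
SCHEMA = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "concert_name", "year", "stadium_id"],
    "stadium": ["stadium_id", "location", "capacity"],
    "employee": ["employee_id", "name", "salary"],
}
EXAMPLES = [
    {"question": "How many concerts were held in 2014?",
     "sql": "SELECT COUNT(*) FROM concert WHERE year = 2014"},
    {"question": "List the name and salary of every employee",
     "sql": "SELECT name, salary FROM employee"},
]

prompt = build_prompt("List the name and age of every singer from France",
                      SCHEMA, EXAMPLES)
print(prompt)
```

The point of the sketch is the prompt shape: only a subset of the schema and a targeted example reach the model, which keeps the context small for databases with many tables and columns. A realistic system would likely replace the token-overlap scorer with a learned or embedding-based retriever.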
Taxonomy
Topics: Advanced Computational Techniques and Applications
Methods: Focus
