SLM-SQL: An Exploration of Small Language Models for Text-to-SQL
Lei Sheng, Shuai-Shuai Xu

TL;DR
This paper investigates the potential of small language models for Text-to-SQL tasks by applying post-training techniques and a corrective self-consistency inference method, achieving significant performance improvements.
Contribution
The study introduces a novel approach combining supervised fine-tuning and reinforcement learning-based post-training for small language models in Text-to-SQL tasks, demonstrating their effectiveness.
Findings
The 0.5B model achieved 56.87% execution accuracy.
The 1.5B model achieved 67.08% execution accuracy.
The proposed post-training and corrective self-consistency inference yield significant performance gains for small models.
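To make the metric above concrete, here is a minimal sketch of how execution accuracy is typically computed: a predicted query counts as correct when it returns the same rows as the reference query on the same database. The `singer` table, the helper names, and the choice to compare results as unordered sets are all assumptions made for illustration; this is not the paper's evaluation code.

```python
import sqlite3

def run_query(conn, sql):
    """Return the result rows as a set, or None if the query fails to execute."""
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def execution_accuracy(conn, pairs):
    """Fraction of (predicted, gold) SQL pairs whose result sets match."""
    hits = 0
    for predicted, gold in pairs:
        pred_rows = run_query(conn, predicted)
        gold_rows = run_query(conn, gold)
        # A prediction that raises an error never counts as a match.
        if pred_rows is not None and pred_rows == gold_rows:
            hits += 1
    return hits / len(pairs)

# Hypothetical toy database, used only for this example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 25), (3, 'Cy', 41);
""")

pairs = [
    # Different SQL text, same rows: counted as correct.
    ("SELECT name FROM singer WHERE age > 28",
     "SELECT name FROM singer WHERE age >= 30"),
    # Executes, but returns different rows: incorrect.
    ("SELECT name FROM singer",
     "SELECT name FROM singer WHERE age < 30"),
    # Misspelled column, so the query fails: incorrect.
    ("SELECT nme FROM singer",
     "SELECT name FROM singer"),
]
score = execution_accuracy(conn, pairs)
print(f"execution accuracy: {score:.2%}")
```

Because the comparison is on results rather than on SQL text, two differently written queries can both be counted as correct.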
Abstract
Large language models (LLMs) have demonstrated strong performance in translating natural language questions into SQL queries (Text-to-SQL). In contrast, small language models (SLMs) ranging from 0.5B to 1.5B parameters currently underperform on Text-to-SQL tasks due to their limited logical reasoning capabilities. However, SLMs offer inherent advantages in inference speed and suitability for edge deployment. To explore their potential in Text-to-SQL applications, we leverage recent advancements in post-training techniques. Specifically, we used the open-source SynSQL-2.5M dataset to construct two derived datasets: SynSQL-Think-916K for SQL generation and SynSQL-Merge-Think-310K for SQL merge revision. We then applied supervised fine-tuning and reinforcement learning-based post-training to the SLM, followed by inference using a corrective self-consistency approach. Experimental results…
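The inference step named in the abstract builds on self-consistency. The sketch below shows only plain self-consistency voting, under stated assumptions: several sampled SQL candidates are executed, grouped by their result set, and ranked by how many candidates agree. The corrective merge revision step the abstract mentions is not modeled here, and the `singer` table, the candidate queries, and the function name `vote_by_execution` are hypothetical.

```python
import sqlite3
from collections import defaultdict

def vote_by_execution(conn, candidates):
    """Group candidate queries by execution result and rank groups by size.

    Returns a list of (representative_sql, votes), most-voted first.
    Candidates that fail to execute are dropped.
    """
    groups = defaultdict(list)
    for sql in candidates:
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error:
            continue
        # Unordered result set as the voting key (an assumption of this sketch).
        groups[frozenset(rows)].append(sql)
    ranked = sorted(groups.values(), key=len, reverse=True)
    return [(group[0], len(group)) for group in ranked]

# Hypothetical toy database and sampled candidates.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bob', 25), (3, 'Cy', 41);
""")

candidates = [
    "SELECT name FROM singer WHERE age > 28",
    "SELECT name FROM singer WHERE age >= 30",
    "SELECT name FROM singer WHERE age > 30",
    "SELECT nme FROM singer",  # invalid column, discarded
]
ranked = vote_by_execution(db, candidates)
best_sql, best_votes = ranked[0]
print(best_sql, best_votes)
```

A ranked list is returned, rather than a single winner, so that a later revision stage could inspect the leading groups when the vote is close; how SLM-SQL actually does this is not described in this summary.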
Code & Models
- 🤗 cycloneboy/SLM-SQL-Base-0.5B (model): 34 downloads, 3 likes
- 🤗 cycloneboy/SLM-SQL-0.5B (model): 81 downloads, 2 likes
- 🤗 cycloneboy/CscSQL-Merge-Qwen2.5-Coder-0.5B-Instruct (model): 5 downloads
- 🤗 cycloneboy/SLM-SQL-Base-1.5B (model): 112 downloads, 2 likes
- 🤗 cycloneboy/CscSQL-Merge-Qwen2.5-Coder-1.5B-Instruct (model): 14 downloads
- 🤗 cycloneboy/SLM-SQL-Base-0.6B (model): 87 downloads
- 🤗 cycloneboy/SLM-SQL-0.6B (model): 29 downloads, 1 like
- 🤗 cycloneboy/SLM-SQL-Base-1.3B (model): 9 downloads, 1 like
- 🤗 cycloneboy/SLM-SQL-1.3B (model): 8 downloads, 1 like
- 🤗 cycloneboy/SLM-SQL-Base-1B (model): 30 downloads
