Open-SQL Framework: Enhancing Text-to-SQL on Open-source Large Language Models
Xiaojun Chen, Tianle Wang, Tianhao Qiu, Jianbin Qin, Min Yang

TL;DR
This paper introduces Open-SQL, a comprehensive framework for improving open-source large language models on Text-to-SQL tasks through evaluation, prompt strategies, fine-tuning, and token-efficient techniques, achieving significant performance gains.
Contribution
The paper presents a systematic methodology including evaluation, prompt design, fine-tuning strategies, and token-efficient techniques to enhance open-source LLMs for Text-to-SQL tasks, surpassing some proprietary models.
Findings
Llama2-7B improved from 2.54% to 41.04% accuracy.
Code Llama-7B improved from 14.54% to 48.24% accuracy.
Code Llama-7B outperformed GPT-4 on the BIRD-Dev dataset.
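To put the reported figures in perspective, the short sketch below computes the absolute gain (in percentage points) and the ratio to the baseline for each model. The accuracy values are copied from the findings above; the `reported` dictionary and the `summarize` helper are illustrative names, not part of the paper.

```python
# Accuracy figures (percent) as quoted in the findings above.
reported = {
    "Llama2-7B": (2.54, 41.04),
    "Code Llama-7B": (14.54, 48.24),
}


def summarize(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, ratio of after to before)."""
    return round(after - before, 2), round(after / before, 1)


for model, (before, after) in reported.items():
    gain, ratio = summarize(before, after)
    print(f"{model}: +{gain} points, {ratio}x the baseline")
```

The absolute gains come to 38.5 points for Llama2-7B and 33.7 points for Code Llama-7B. The ratio is much larger for Llama2-7B only because its starting accuracy was so low.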
Abstract
Despite the success of large language models (LLMs) in Text-to-SQL tasks, open-source LLMs encounter challenges in contextual understanding and response coherence. To tackle these issues, we present Open-SQL, a systematic methodology tailored for Text-to-SQL with open-source LLMs. Our contributions include a comprehensive evaluation of open-source LLMs in Text-to-SQL tasks, the Open Prompt strategy for effective question representation, and novel strategies for supervised fine-tuning. We explore the benefits of Chain-of-Thought in step-by-step inference and propose the Open Example method for enhanced few-shot learning. Additionally, we introduce token-efficient techniques, such as Variable-length Open DB Schema, Target Column Truncation, and Example Column Truncation, addressing challenges in large-scale databases. Our findings emphasize the need for further…
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Scientific Computing and Data Management · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Residual Connection · Absolute Position Encodings · Byte Pair Encoding · Adam · Dropout · Softmax
