RSL-SQL: Robust Schema Linking in Text-to-SQL Generation
Zhenbiao Cao, Yuanlei Zheng, Zhihao Fan, Xiaojin Zhang, Wei Chen, Xiang Bai

TL;DR
RSL-SQL introduces a robust schema linking framework for Text-to-SQL tasks, improving recall and accuracy by combining bidirectional linking, contextual augmentation, and self-correction, achieving state-of-the-art results on benchmark datasets.
Contribution
The paper presents a novel RSL-SQL framework that enhances schema linking in Text-to-SQL generation through multiple strategies, significantly improving recall and accuracy over existing methods.
Findings
Achieves a strict recall of 94% in pattern linking while reducing the number of input columns by 83%.
Attains state-of-the-art execution accuracy on BIRD (67.2%) and Spider (87.9%) benchmarks.
Outperforms GPT-4-based systems while using cheaper models with the same prompts.
Abstract
Text-to-SQL generation aims to translate natural language questions into SQL statements. In Text-to-SQL based on large language models, schema linking is a widely adopted strategy to streamline the input for LLMs by selecting only relevant schema elements, thereby reducing noise and computational overhead. However, schema linking faces risks that require caution, including the potential omission of necessary elements and disruption of database structural integrity. To address these challenges, we propose a novel framework called RSL-SQL that combines bidirectional schema linking, contextual information augmentation, binary selection strategy, and multi-turn self-correction. We improve the recall of pattern linking using forward and backward pruning methods, achieving a strict recall of 94% while reducing the number of input columns by 83%. Furthermore, it hedges the risk by voting…
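To make the two headline numbers concrete, here is a minimal Python sketch of how a bidirectional selection, a strict recall, and a column reduction rate could be computed. It is an illustration under stated assumptions, not the authors' implementation: it assumes the forward and backward passes each return a set of column names that are merged by union, that "strict recall" means the fraction of questions whose selected columns contain every gold column, and that the reduction is measured against the full schema size. The function names bidirectional_link, strict_recall, and column_reduction are hypothetical.

```python
from typing import Iterable, List, Set

Column = str  # hypothetical representation, e.g. "table.column"


def bidirectional_link(forward: Iterable[Column],
                       backward: Iterable[Column]) -> Set[Column]:
    """Merge the columns kept by the forward and backward passes.

    Assumption: the two passes are combined by set union, so a column
    missed by one pass can still be recovered by the other.
    """
    return set(forward) | set(backward)


def strict_recall(selected: List[Set[Column]],
                  gold: List[Set[Column]]) -> float:
    """Fraction of questions whose selection contains ALL gold columns.

    Assumption: this is what "strict recall" means; one missing gold
    column makes the whole question count as a miss.
    """
    if not gold:
        return 0.0
    hits = sum(1 for s, g in zip(selected, gold) if g <= s)
    return hits / len(gold)


def column_reduction(selected: List[Set[Column]],
                     full_schema_sizes: List[int]) -> float:
    """Share of columns removed from the input, relative to the full schema."""
    total = sum(full_schema_sizes)
    if total == 0:
        return 0.0
    kept = sum(len(s) for s in selected)
    return 1 - kept / total


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    linked = bidirectional_link({"singer.name"},
                                {"singer.age", "concert.year"})
    selected = [linked, {"b.z"}]
    gold = [{"singer.name", "singer.age"}, {"b.z", "b.w"}]
    print(strict_recall(selected, gold))          # first question hit, second missed
    print(column_reduction(selected, [12, 8]))    # 4 of 20 columns kept
```

Under these assumptions, the abstract's figures would correspond to strict_recall returning roughly 0.94 and column_reduction returning roughly 0.83 over a full benchmark.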
Taxonomy
Topics: Advanced Database Systems and Queries · Mathematics, Computing, and Information Processing · Logic, programming, and type systems
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Position-Wise Feed-Forward Layer · Adam · Multi-Head Attention · Residual Connection · Byte Pair Encoding · Dropout · Absolute Position Encodings
