Knapsack Optimization-based Schema Linking for LLM-based Text-to-SQL Generation

Zheng Yuan; Hao Chen; Zijin Hong; Qinggang Zhang; Feiran Huang; Qing Li; Xiao Huang

arXiv:2502.12911·cs.CL·April 23, 2026


TL;DR

This paper introduces KaSLA, a knapsack optimization-based schema linking method that enhances SQL generation from user queries by accurately identifying relevant schema elements and reducing redundancy.

Contribution

KaSLA is a novel plug-in schema linking approach that employs hierarchical linking and knapsack optimization to improve schema linking accuracy for Text2SQL tasks.

Findings

01 KaSLA outperforms existing schema linking methods on the Spider and BIRD benchmarks.

02 KaSLA significantly improves the SQL generation performance of state-of-the-art Text2SQL models.

03 The code for KaSLA is publicly available in the DEEP-PolyU/KaSLA GitHub repository.

Abstract

Generating SQL from user queries is a long-standing challenge, where the accuracy of initial schema linking significantly impacts subsequent SQL generation performance. However, current schema linking models still struggle with missing relevant schema elements or an excess of redundant ones. A crucial reason is that the commonly used metrics, recall and precision, fail to capture whether relevant elements are missing and thus cannot reflect actual schema linking performance. Motivated by this, we propose enhanced schema linking metrics by introducing a restricted missing indicator. Accordingly, we introduce the Knapsack optimization-based Schema Linking Approach (KaSLA), a plug-in schema linking method designed to prevent the omission of relevant schema elements while minimizing the inclusion of redundant ones. KaSLA…
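To make the knapsack framing concrete, the following is a minimal sketch of a generic 0-1 knapsack selection over candidate schema elements. It is not KaSLA's actual formulation, which the truncated abstract does not spell out. The function name `select_schema_elements`, the per-element relevance score, the integer redundancy weight, and the capacity budget are all hypothetical assumptions introduced only for illustration.

```python
from typing import List, Tuple

# Hypothetical element format: (name, relevance_score, redundancy_weight).
# These fields are illustrative assumptions, not definitions from the paper.
SchemaElement = Tuple[str, float, int]


def select_schema_elements(elements: List[SchemaElement], capacity: int) -> List[str]:
    """Pick the subset with maximum total relevance whose total weight fits the capacity."""
    n = len(elements)
    # best[i][c] = highest total relevance using the first i elements within weight budget c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i, (_, value, weight) in enumerate(elements, start=1):
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # option 1: skip element i
            if weight <= c:
                candidate = best[i - 1][c - weight] + value  # option 2: take element i
                if candidate > best[i][c]:
                    best[i][c] = candidate

    # Walk the table backwards to recover which elements were taken.
    chosen: List[str] = []
    c = capacity
    for i in range(n, 0, -1):
        name, _, weight = elements[i - 1]
        if best[i][c] != best[i - 1][c]:
            chosen.append(name)
            c -= weight
    chosen.reverse()
    return chosen


if __name__ == "__main__":
    # Made-up candidate columns with made-up scores and weights.
    candidates = [
        ("singer.name", 0.9, 1),
        ("singer.age", 0.8, 1),
        ("concert.year", 0.3, 2),
        ("stadium.capacity", 0.1, 3),
    ]
    print(select_schema_elements(candidates, capacity=3))
```

In this toy setup, the capacity caps how much redundancy the selected schema subset may carry, so highly relevant, low-weight elements are kept first. How KaSLA actually estimates relevance, redundancy, and the budget is described in the paper, not here.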


Code & Models

Repositories

DEEP-PolyU/KaSLA (GitHub)
