ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
Yang Qin, Chao Chen, Zhihang Fu, Ze Chen, Dezhong Peng, Peng Hu, Jieping Ye

TL;DR
ROUTE is a novel approach that enhances open-source LLMs for Text-to-SQL through multi-task fine-tuning and collaborative prompting, significantly improving the accuracy and robustness of SQL query generation.
Contribution
The paper introduces a comprehensive multi-task fine-tuning framework and a collaborative prompting strategy to improve open-source LLMs for Text2SQL tasks, addressing hallucination issues.
Findings
Outperforms existing Text2SQL methods on multiple benchmarks.
Enhances SQL syntax understanding through diverse SFT tasks.
Reduces hallucinations via collaborative prompting strategies.
Abstract
Despite the significant advancements in Text-to-SQL (Text2SQL) facilitated by large language models (LLMs), the latest state-of-the-art techniques are still trapped in the in-context learning of closed-source LLMs (e.g., GPT-4), which limits their applicability in open scenarios. To address this challenge, we propose a novel RObust mUltitask Tuning and collaboration mEthod (ROUTE) to improve the comprehensive capabilities of open-source LLMs for Text2SQL, thereby providing a more practical solution. Our approach begins with multi-task supervised fine-tuning (SFT) using various synthetic training data related to SQL generation. Unlike existing SFT-based Text2SQL methods, we introduce several additional SFT tasks, including schema linking, noise correction, and continuation writing. Engaging in a variety of SQL generation tasks enhances the model's understanding of SQL syntax and…
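The multi-task SFT setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' data pipeline: the prompt wording, the function names (`render_schema`, `linked_elements`, `build_examples`), the word-matching heuristic for schema linking, and the externally supplied `noisy_sql` are all assumptions made for illustration. It only shows how one ⟨question, schema, SQL⟩ triple might be expanded into one training example per task named above.

```python
import re


def render_schema(schema):
    """Render a {table: [columns]} dict as one line per table."""
    return "\n".join(f"{t}({', '.join(cols)})" for t, cols in schema.items())


def linked_elements(schema, sql):
    """Crude heuristic (an assumption): a table or column counts as
    'linked' if its name appears as an identifier in the gold SQL."""
    words = set(re.findall(r"[A-Za-z_]\w*", sql.lower()))
    linked = []
    for table, cols in schema.items():
        if table.lower() in words:
            linked.append(table)
            linked.extend(f"{table}.{c}" for c in cols if c.lower() in words)
    return linked


def build_examples(question, schema, sql, noisy_sql, prefix_tokens=3):
    """Expand one triple into four hypothetical SFT examples.

    noisy_sql is a deliberately corrupted query supplied by the caller;
    how such noise is synthesized is not specified here.
    """
    ctx = f"Schema:\n{render_schema(schema)}\nQuestion: {question}\n"
    tokens = sql.split()
    prefix = " ".join(tokens[:prefix_tokens])
    rest = " ".join(tokens[prefix_tokens:])
    return [
        {"task": "text2sql",
         "prompt": ctx + "SQL:",
         "response": sql},
        {"task": "schema_linking",
         "prompt": ctx + "Relevant tables and columns:",
         "response": ", ".join(linked_elements(schema, sql))},
        {"task": "noise_correction",
         "prompt": ctx + f"Candidate SQL: {noisy_sql}\nCorrected SQL:",
         "response": sql},
        {"task": "continuation",
         "prompt": ctx + f"SQL so far: {prefix}\nContinue:",
         "response": rest},
    ]
```

Mixing the four example types into a single training set is one simple way to realize multi-task tuning; the actual task formats and mixing ratios used by ROUTE are not given in this summary.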
Peer Reviews
Decision: ICLR 2025 Poster
1) The proposed method significantly improves the performance of open-source LLMs and outperforms all existing methods trained on open-source LLMs. 2) The proposed MCP approach not only enhances the performance of models trained with MSFT but also improves other models. 3) The novel MSFT method substantially boosts model performance compared to standard SFT.
1) Although this paper focuses more on open-source LLMs, some recent approaches, such as CHASE-SQL, Distillery, and CHESS, are not included as benchmarks in their experiments. 2) The proposed approach is a multi-step pipeline that can be prone to error propagation. To better understand the performance of the schema linking module and ensure it is not introducing errors into the pipeline, it would be beneficial to report the precision and recall of the schema linking module, as done in CHESS and
1. The paper introduces a multitask learning approach that leverages several text-to-SQL related tasks. Noise Correction is designed to assess whether the execution result of a SQL query correctly answers the question, reducing hallucinations when paired with multi-turn generation. 2. ROUTE demonstrates competitive accuracy, outperforming some closed-source methods on benchmarks, thus showcasing the effectiveness of multitask training over single-task fine-tuning. 3. The authors provide comprehe
1. The paper lacks an ablation study on the contribution of each task in MSFT. For instance, the loss from continuation writing is likely already included in text-to-SQL learning after the first token of the SQL prediction. It is unclear how each task directly benefits SQL generation and other inference components. 2. Although Noise Correction helps improve performance, it relies on execution results within the model, which may be difficult to apply to queries with large outputs, such as selecti
1. The paper incorporates multiple tasks to enhance text-to-SQL capabilities, making the LLM more versatile and capable of handling complex SQL generation scenarios. 2. The paper evaluates ROUTE on several well-known benchmarks and compares its performance with other prompting and fine-tuning methods, demonstrating its effectiveness in real-world applications.
1. The authors mention that "Most training-based methods only incorporate the ⟨Question, SQL⟩ pairs for SFT, resulting in degraded performance in other tasks, such as schema linking." However, our approach usually incorporates a ⟨Question, Schema, SQL⟩ tuple for SFT. Additionally, a reduction in schema linking performance cannot be seen as a limitation of existing methods. If a specific task is not included in training, optimal results for that task are not expected. Therefore, this should not b
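One review above asks for the precision and recall of the schema linking module. A minimal sketch of how these could be computed for a single example follows, assuming predicted and gold schema elements are given as collections of `table.column` strings. The function name `schema_linking_scores` and the convention of returning 0.0 for an empty set are illustrative assumptions, not taken from the paper.

```python
def schema_linking_scores(predicted, gold):
    """Return (precision, recall) of predicted schema elements against gold.

    Both arguments are iterables of strings such as "singer.name".
    An empty prediction or gold set yields 0.0 for the affected metric
    (a convention assumed here for simplicity).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

For a multi-step pipeline, recall is arguably the more critical number: a gold column dropped during schema linking cannot be recovered by later SQL generation, whereas extra columns only add noise to the prompt.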
Taxonomy
Topics: Advanced Database Systems and Queries
Methods: Shrink and Fine-Tune
