Reboost Large Language Model-based Text-to-SQL, Text-to-Python, and Text-to-Function -- with Real Applications in Traffic Domain
Guanghu Sui, Zhishuai Li, Ziyue Li, Sun Yang, Jingqing Ruan, Hangyu Mao, Rui Zhao

TL;DR
This paper introduces a flexible prompting method for Large Language Models that significantly improves performance in Text-to-SQL, Text-to-Python, and Text-to-Function tasks, especially in complex real-world traffic domain applications.
Contribution
The paper presents a novel, adaptable prompting approach built on query rewriting and SQL boosting, which raises LLM performance on diverse datasets above the previous state of the art.
Findings
Accuracy on the business dataset improved from 21.05 to 65.79.
The method outperforms previous SOTA even with less capable models.
Analysis of Text-to-Python and Text-to-Function provides valuable insights.
Abstract
The previous state-of-the-art (SOTA) method achieved remarkable execution accuracy on the Spider dataset, one of the largest and most diverse datasets in the Text-to-SQL domain. However, when we reproduced it on the business dataset, we observed a significant drop in performance. We examined the differences in dataset complexity and in the clarity of the questions' intentions, and assessed how those differences affect the performance of prompting methods. We then developed a more adaptable and more general prompting method built mainly on query rewriting and SQL boosting: query rewriting transforms vague information into exact and precise information, and SQL boosting enhances the SQL itself by incorporating execution feedback and the query results from the database content. To prevent information gaps, we include the comments, value types, and value samples for…
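The idea of enhancing SQL with execution feedback can be illustrated with a small loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: `boost_sql`, `stub_generator`, and the `roads` table are hypothetical names introduced for illustration, SQLite stands in for the real database, and the stub stands in for an LLM call.

```python
import sqlite3
from typing import Callable, List, Optional, Tuple

# Hypothetical signature: (question, feedback) -> SQL string.
Generator = Callable[[str, Optional[str]], str]


def boost_sql(question: str, conn: sqlite3.Connection,
              generate_sql: Generator, max_rounds: int = 3) -> Tuple[str, List[tuple]]:
    """Generate SQL, execute it, and feed errors or empty results back."""
    feedback: Optional[str] = None
    sql = ""
    for _ in range(max_rounds):
        sql = generate_sql(question, feedback)
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Execution feedback: pass the database error to the next attempt.
            feedback = f"SQL: {sql}\nError: {exc}"
            continue
        if rows:
            return sql, rows
        # Result feedback: an empty result may indicate a wrong filter value.
        feedback = f"SQL: {sql}\nResult: empty"
    return sql, []


def stub_generator(question: str, feedback: Optional[str]) -> str:
    """Stand-in for an LLM: wrong column first, corrected once feedback arrives."""
    if feedback is None:
        return "SELECT COUNT(*) FROM roads WHERE speed > 80"
    return "SELECT COUNT(*) FROM roads WHERE speed_limit > 80"


# Toy database (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (name TEXT, speed_limit INTEGER)")
conn.executemany("INSERT INTO roads VALUES (?, ?)",
                 [("A1", 100), ("B2", 60), ("C3", 90)])

final_sql, final_rows = boost_sql(
    "How many roads have a speed limit above 80?", conn, stub_generator)
print(final_sql, final_rows)
```

In this toy run the first query fails because the column `speed` does not exist, and the error message lets the second attempt use `speed_limit`. A real system would place the feedback string in the LLM prompt instead of branching on it.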
Taxonomy
Topics: Topic Modeling · Scientific Computing and Data Management · Web Data Mining and Analysis
