Text-to-SQL based on Large Language Models and Database Keyword Search
Eduardo R. Nascimento (1, 3), Caio Viktor S. Avila (1, 4), Yenier T. Izquierdo (1), Grettel M. García (1), Lucas Feijó L. Andrade (1), Michelle S.P. Facina (2), Melissa Lemos (1, 3), Marco A. Casanova (1, 3) ((1) Instituto Tecgraf, PUC-Rio, Rio de Janeiro, Brazil, (2)

TL;DR
This paper introduces a novel Text-to-SQL approach that combines Large Language Models with a database keyword search platform, improving accuracy on real-world databases by enhancing schema-linking and join synthesis.
Contribution
It proposes a dynamic few-shot example strategy and leverages a keyword search platform to improve schema-linking and join synthesis in Text-to-SQL tasks for real-world databases.
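As a rough illustration of what a dynamic few-shot strategy can look like, the sketch below picks the stored (question, SQL) pairs most similar to the incoming question and places them in the prompt. The token-overlap (Jaccard) similarity, the `EXAMPLE_POOL` contents, and the function names are all assumptions made for this example; the summary does not state how the paper selects its examples.

```python
import re

# Hypothetical pool of previously solved (question, SQL) pairs.
EXAMPLE_POOL = [
    {"question": "List employees in the marketing department",
     "sql": "SELECT e.name FROM employee e JOIN department d "
            "ON e.department_id = d.id WHERE d.name = 'Marketing'"},
    {"question": "What is the total budget per project",
     "sql": "SELECT title, SUM(budget) FROM project GROUP BY title"},
    {"question": "Show the departments located in Rio",
     "sql": "SELECT name FROM department WHERE city = 'Rio'"},
]


def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_examples(question, pool, k=2):
    """Return the k pool entries whose questions best overlap the input."""
    q_tokens = tokenize(question)
    ranked = sorted(
        pool,
        key=lambda ex: jaccard(q_tokens, tokenize(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, examples):
    """Assemble a few-shot prompt from the selected examples (assumed layout)."""
    parts = []
    for ex in examples:
        parts.append(f"Question: {ex['question']}\nSQL: {ex['sql']}\n")
    parts.append(f"Question: {question}\nSQL:")
    return "\n".join(parts)


chosen = select_examples("List the employees in the sales department", EXAMPLE_POOL)
prompt = build_prompt("List the employees in the sales department", chosen)
```

Because the examples are chosen per question rather than fixed in advance, the prompt changes with each input; a production system would more likely rank by embedding similarity than by raw token overlap.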
Findings
Achieves higher accuracy than state-of-the-art methods on real-world database benchmarks.
Improves schema-linking precision and recall through keyword matching.
Simplifies SQL query generation by synthesizing views for complex joins.
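To make the schema-linking finding concrete, here is a minimal sketch of keyword matching against both schema names and stored data values. The toy `SCHEMA`, the `VALUE_INDEX`, and the naive plural handling are assumptions for illustration only; they do not reproduce the KwS platform's actual matching service.

```python
import re

# Hypothetical schema: table name -> column names.
SCHEMA = {
    "employee": ["id", "name", "department_id"],
    "department": ["id", "name", "city"],
    "project": ["id", "title", "budget"],
}

# Hypothetical value index: lowercase data value -> (table, column) holding it.
VALUE_INDEX = {
    "rio": ("department", "city"),
    "sales": ("department", "name"),
}


def link_schema(question):
    """Map question keywords to the tables and columns they appear to mention."""
    tokens = set(re.findall(r"[a-z0-9]+", question.lower()))
    linked = {}
    for table, columns in SCHEMA.items():
        # Match the table name itself, or a naive plural of it.
        if table in tokens or table + "s" in tokens:
            linked.setdefault(table, set())
        # Match column names that occur verbatim in the question.
        for column in columns:
            if column in tokens:
                linked.setdefault(table, set()).add(column)
    # Match keywords that are known data values, linking the column that stores them.
    for token in tokens:
        if token in VALUE_INDEX:
            table, column = VALUE_INDEX[token]
            linked.setdefault(table, set()).add(column)
    return linked


links = link_schema("Names of employees in the Rio department")
```

Matching on data values is what lets a keyword such as "Rio" pull in `department.city` even though the question never names that column; only the linked tables and columns would then be passed to the LLM.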
Abstract
Text-to-SQL prompt strategies based on Large Language Models (LLMs) achieve remarkable performance on well-known benchmarks. However, when applied to real-world databases, their performance is significantly lower than on these benchmarks, especially for Natural Language (NL) questions that require complex filters and joins. This paper proposes a strategy to compile NL questions into SQL queries that incorporates a dynamic few-shot examples strategy and leverages the services provided by a database keyword search (KwS) platform. The paper details how the examples provided and the keyword-matching service that the KwS platform offers improve the precision and recall of the schema-linking process. Then, it shows how the KwS platform can be used to synthesize a view that captures the joins required to process an input NL question and thereby simplify…
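The view-synthesis idea can be sketched as follows: given the tables linked to a question, find a join path over the foreign keys and wrap it in a view, so that the SQL left to generate targets a single relation. The `FOREIGN_KEYS` list, the breadth-first search, and the `table_column` naming of view columns are assumptions for this example, not the paper's algorithm.

```python
import sqlite3
from collections import deque

# Hypothetical foreign keys: (source table, source column, target table, target column).
FOREIGN_KEYS = [
    ("employee", "department_id", "department", "id"),
    ("assignment", "employee_id", "employee", "id"),
    ("assignment", "project_id", "project", "id"),
]


def join_path(start, goal):
    """Breadth-first search over the foreign-key graph; returns the edges used."""
    adjacency = {}
    for fk in FOREIGN_KEYS:
        adjacency.setdefault(fk[0], []).append((fk[2], fk))
        adjacency.setdefault(fk[2], []).append((fk[0], fk))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        table, path = queue.popleft()
        if table == goal:
            return path
        for neighbor, fk in adjacency.get(table, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [fk]))
    return None


def synthesize_view(name, start, goal, columns):
    """Build a CREATE VIEW statement that joins start to goal along the path."""
    path = join_path(start, goal)
    if path is None:
        raise ValueError(f"no join path between {start} and {goal}")
    from_clause = start
    joined = {start}
    for src, src_col, dst, dst_col in path:
        # Each edge introduces whichever endpoint is not joined yet.
        new_table = dst if src in joined else src
        from_clause += f" JOIN {new_table} ON {src}.{src_col} = {dst}.{dst_col}"
        joined.add(new_table)
    select_list = ", ".join(f"{t}.{c} AS {t}_{c}" for t, c in columns)
    return f"CREATE VIEW {name} AS SELECT {select_list} FROM {from_clause}"


def demo():
    """Create a toy database, synthesize the view, and query it without joins."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department_id INTEGER);
        INSERT INTO department VALUES (1, 'Sales', 'Rio');
        INSERT INTO department VALUES (2, 'Research', 'Recife');
        INSERT INTO employee VALUES (1, 'Ana', 1);
        INSERT INTO employee VALUES (2, 'Bruno', 2);
    """)
    view_sql = synthesize_view(
        "question_view", "employee", "department",
        [("employee", "name"), ("department", "city")],
    )
    conn.execute(view_sql)
    # The query over the view needs only a filter, no join.
    rows = conn.execute(
        "SELECT employee_name FROM question_view WHERE department_city = 'Rio'"
    ).fetchall()
    conn.close()
    return rows
```

With the joins hidden inside `question_view`, the query the LLM has to write reduces to a single-table selection with a filter, which is the simplification the abstract describes.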
Taxonomy
Topics: Advanced Computational Techniques and Applications · Service-Oriented Architecture and Web Services · Web Data Mining and Analysis
