On Repairing Natural Language to SQL Queries
Aidan Z.H. Yang, Ricardo Brancas, Pedro Esteves, Sofia Aparicio, Joao Pedro Nadkarni, Miguel Terra-Neves, Vasco Manquinho, Ruben Martins

TL;DR
This paper investigates the failure modes of text-to-SQL tools and introduces a mutation-based repair method that improves the correctness of generated queries across different systems.
Contribution
It presents a novel, tool-agnostic mutation-based approach to repair incorrect text-to-SQL queries, enhancing their accuracy.
Findings
Repairs a significant number of failing queries
Effective across multiple text-to-SQL tools
Improves overall query correctness
Abstract
Data analysts use SQL queries to access and manipulate data on their databases. However, these queries are often challenging to write, and small mistakes can lead to unexpected data output. Recent work has explored several ways to automatically synthesize queries based on a user-provided specification. One promising technique called text-to-SQL consists of the user providing a natural language description of the intended behavior and the database's schema. Even though text-to-SQL tools are becoming more accurate, there are still many instances where they fail to produce the correct query. In this paper, we analyze when text-to-SQL tools fail to return the correct query and show that it is often the case that the returned query is close to a correct query. We propose to repair these failing queries using a mutation-based approach that is agnostic to the text-to-SQL tool being used. We…
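The idea of repairing a near-miss query by mutation can be illustrated with a small sketch. This is not the authors' implementation: the mutation operators (swapping comparison operators and aggregate functions), the helper names `mutants`, `run`, and `repair`, and the use of an expected result set as the correctness oracle are all assumptions made for illustration. The sketch only shows that, when a returned query is one token away from a correct one, a simple search over single-token mutations can find it without knowing which text-to-SQL tool produced the query.

```python
import re
import sqlite3

# Hypothetical mutation operators (illustrative, not taken from the paper):
# a comparison operator or aggregate may be swapped for another of its kind.
COMPARISONS = ["<", "<=", ">", ">=", "=", "!="]
AGGREGATES = ["COUNT", "SUM", "AVG", "MIN", "MAX"]

# Two-character operators are listed first so "<=" is not split into "<" and "=".
TOKEN = re.compile(r"<=|>=|!=|<|>|=|\b(?:COUNT|SUM|AVG|MIN|MAX)\b", re.IGNORECASE)


def mutants(query):
    """Yield every query that differs from `query` by exactly one token swap."""
    for match in TOKEN.finditer(query):
        token = match.group(0).upper()
        pool = COMPARISONS if token in COMPARISONS else AGGREGATES
        for alternative in pool:
            if alternative != token:
                yield query[:match.start()] + alternative + query[match.end():]


def run(conn, query):
    """Execute `query`; return its rows in a canonical order, or None on error."""
    try:
        return sorted(conn.execute(query).fetchall(), key=repr)
    except sqlite3.Error:
        return None


def repair(conn, query, expected_rows):
    """Return the first single-mutation variant matching `expected_rows`.

    Assumes an oracle in the form of the expected result set (an assumption
    of this sketch). Returns None if no single mutation fixes the query.
    """
    expected = sorted(expected_rows, key=repr)
    if run(conn, query) == expected:
        return query  # already correct, nothing to repair
    for candidate in mutants(query):
        if run(conn, candidate) == expected:
            return candidate
    return None


# Toy database and a query that is one operator away from the intended one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)", [("ann", 50), ("bob", 70), ("cy", 90)]
)

failing = "SELECT name FROM emp WHERE salary < 70"
expected = [("bob",), ("cy",)]  # intended: employees earning at least 70
fixed = repair(conn, failing, expected)
print(fixed)
```

The search here is exhaustive over single-token swaps, which keeps the sketch short; a practical repair tool would need a richer set of mutations and a way to prioritize candidates.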
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Data Quality and Management
