Natural language to SQL in low-code platforms
Sofia Aparicio, Samuel Arcadinho, João Nadkarni, David Aparício, João Lages, Mariana Lourenço, Bartłomiej Matejczyk, Filipe Assunção

TL;DR
This paper presents a pipeline for translating natural language into SQL queries in low-code platforms, including data collection, model training, and a feedback loop for continuous improvement, significantly enhancing user adoption and engagement.
Contribution
It introduces a comprehensive pipeline with a feedback loop for NL-to-SQL translation, tailored for low-code platforms, and demonstrates substantial improvements through A/B testing.
Findings
240% increase in feature adoption
220% increase in engagement rate
90% reduction in failure rate
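To make the percentages above concrete, here is a minimal sketch of how a relative change between a baseline and a new variant is typically computed. The function name and the input numbers are hypothetical, chosen only to illustrate the arithmetic; they are not the paper's underlying measurements. Note that a 240% increase means the new value is 3.4 times the baseline, not 2.4 times.

```python
def relative_change_pct(baseline: float, treatment: float) -> float:
    """Percentage change of `treatment` relative to `baseline`.

    Positive values are increases, negative values are reductions.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (treatment - baseline) * 100.0 / baseline


# Hypothetical counts per 100 users, for illustration only.
adoption_change = relative_change_pct(baseline=5, treatment=17)
failure_change = relative_change_pct(baseline=10, treatment=1)

print(f"adoption change: {adoption_change:+.0f}%")
print(f"failure change: {failure_change:+.0f}%")
```

Under these assumed inputs, the first call corresponds to a 240% increase and the second to a 90% reduction.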
Abstract
One of the developers' biggest challenges in low-code platforms is retrieving data from a database using SQL queries. Here, we propose a pipeline allowing developers to write natural language (NL) to retrieve data. In this study, we collect, label, and validate data covering the SQL queries most often performed by OutSystems users. We use that data to train a NL model that generates SQL. Alongside this, we describe the entire pipeline, which comprises a feedback loop that allows us to quickly collect production data and use it to retrain our SQL generation model. Using crowd-sourcing, we collect 26k NL and SQL pairs and obtain an additional 1k pairs from production data. Finally, we develop a UI that allows developers to input a NL query in a prompt and receive a user-friendly representation of the resulting SQL query. We use A/B testing to compare four different models in production…
Taxonomy
Topics: Software Engineering Research · Advanced Database Systems and Queries · Scientific Computing and Data Management
