Prompt Engineering Techniques for Context-dependent Text-to-SQL in Arabic
Saleh Almohaimeed, May Alsofyani, Saad Almohaimeed, Mansour Al Ghanim, Liqiang Wang

TL;DR
This paper introduces Ar-SParC, the first Arabic cross-domain, context-dependent text-to-SQL dataset, and evaluates prompt engineering techniques and a novel GAT corrector to improve model performance in Arabic SQL generation.
Contribution
It presents the Ar-SParC dataset for Arabic, explores prompt engineering with large language models, and proposes the GAT corrector to improve accuracy on Arabic text-to-SQL tasks.
Findings
The GAT corrector improved execution and interaction accuracy by around 1.9%.
40 experiments were conducted with GPT-3.5-turbo and GPT-4.5-turbo.
First Arabic dataset for cross-domain, context-dependent text-to-SQL.
Abstract
In recent years, the task of cross-domain, context-dependent text-to-SQL has received significant attention. It enables users with no prior knowledge of SQL to have a conversation with databases using natural language. However, most of the available datasets and research have been conducted in English, along with some work in Chinese. To date, no effort has been made to address this task in the Arabic language. In this paper, we introduce Ar-SParC, the first Arabic cross-domain, context-dependent text-to-SQL dataset. The dataset consists of 3,450 sequences of interrelated questions, each sequence containing an average of approximately three questions, which results in a total of 10,225 questions along with their corresponding SQL queries. We conducted 40 experiments on the Ar-SParC dataset using two large language models, GPT-3.5-turbo and GPT-4.5-turbo, applying 10 different prompt…
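The dataset figures quoted in the abstract can be checked with a line of arithmetic. The short Python sketch below divides the reported question count by the reported sequence count; the variable names are illustrative and do not come from the paper.

```python
# Figures as reported in the abstract.
num_sequences = 3450
num_questions = 10225

# Average number of questions per sequence.
avg_questions = num_questions / num_sequences
print(f"{avg_questions:.2f}")  # prints 2.96
```

The result, just under three, agrees with the abstract's "approximately three questions" per sequence.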
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Web Application Security Vulnerabilities
