Before Generation, Align it! A Novel and Effective Strategy for Mitigating Hallucinations in Text-to-SQL Generation

Ge Qu; Jinyang Li; Bowen Li; Bowen Qin; Nan Huo; Chenhao Ma; Reynold Cheng

arXiv:2405.15307 · cs.CL · May 27, 2024 · 1 citation

TL;DR

This paper introduces Task Alignment (TA), a strategy that reduces hallucinations in text-to-SQL generation by having the model draw on experience from similar tasks, improving both robustness and performance.

Contribution

The paper proposes TA, a new task alignment strategy, and develops TA-SQL, which effectively mitigates hallucinations and enhances performance across multiple models and benchmarks.

Findings

01. Improves the GPT-4 baseline by 21.23% (relative) on BIRD dev

02. Yields significant gains across six models

03. Enhances robustness on complex benchmarks

Abstract

Large Language Models (LLMs) driven by In-Context Learning (ICL) have significantly improved the performance of text-to-SQL. Previous methods generally employ a two-stage reasoning framework, namely 1) schema linking and 2) logical synthesis, which makes the framework both effective and interpretable. Despite these advancements, the inherently poor generalization of LLMs often results in hallucinations, which keep LLMs from reaching their full potential. In this work, we first identify and categorize the common types of hallucinations at each stage in text-to-SQL. We then introduce a novel strategy, Task Alignment (TA), designed to mitigate hallucinations at each stage. TA encourages LLMs to take advantage of experiences from similar tasks rather than starting the tasks from scratch. This can help LLMs reduce the burden of generalization, thereby mitigating hallucinations…
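To make the two-stage split in the abstract concrete, here is a minimal toy sketch of schema linking followed by logical synthesis. It is not TA-SQL and uses no LLM: the schema, the keyword-matching linker, the single-table template, and every function name (`link_schema`, `synthesize_sql`, `run_query`) are assumptions made up for illustration.

```python
import re
import sqlite3

# Hypothetical toy schema; not taken from the paper or from BIRD.
SCHEMA = {
    "singer": ["singer_id", "name", "country", "age"],
    "concert": ["concert_id", "venue", "year"],
}


def link_schema(question, schema):
    """Stage 1 (schema linking): keep tables and columns named in the question."""
    tokens = set(re.findall(r"[a-z_]+", question.lower()))
    linked = {}
    for table, columns in schema.items():
        hits = [c for c in columns if c in tokens]
        if table in tokens or hits:
            linked[table] = hits
    return linked


def synthesize_sql(linked):
    """Stage 2 (logical synthesis): build SQL from the linked schema items only."""
    if len(linked) != 1:
        raise ValueError("this toy sketch handles single-table questions only")
    table, columns = next(iter(linked.items()))
    select_list = ", ".join(columns) if columns else "*"
    return f"SELECT {select_list} FROM {table}"


def run_query(sql):
    """Execute the query against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE singer "
        "(singer_id INTEGER, name TEXT, country TEXT, age INTEGER)"
    )
    conn.execute("INSERT INTO singer VALUES (1, 'Ana', 'Chile', 30)")
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


question = "List the name and country of every singer"
linked = link_schema(question, SCHEMA)
sql = synthesize_sql(linked)
rows = run_query(sql)
print(linked)
print(sql)
print(rows)
```

Because the second stage in this sketch reads only what the first stage returned, the generated query cannot name a table or column that is absent from the schema. A real system would replace both functions with prompted LLM calls, which is where the hallucinations discussed in the abstract can enter.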

Code & Models

Repositories

quge2023/TA-SQL (Official)

Taxonomy

Topics: Big Data and Digital Economy · Advanced Malware Detection Techniques · Misinformation and Its Impacts

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Position-Wise Feed-Forward Layer · Multi-Head Attention · Dropout · Dense Connections