Meta-aware Learning in text-to-SQL Large Language Model

Wenda Zhang

arXiv:2505.18929·cs.AI·May 27, 2025

TL;DR

This paper introduces a meta-aware learning framework that enhances large language models for text-to-SQL tasks by integrating domain knowledge, schema, and reasoning strategies to improve SQL generation accuracy and robustness.

Contribution

It presents a novel meta-aware learning approach with four strategies to better incorporate database and domain knowledge into LLMs for improved SQL generation.

Findings

01. Improved execution accuracy in SQL generation

02. Enhanced multi-task SQL capabilities

03. Reduced catastrophic forgetting during training

Abstract

Advances in large language models (LLMs) have created new opportunities for text-to-SQL tasks to overcome two main challenges in business applications: understanding complex domain information and understanding complex database structures. In this paper, we propose a meta-aware learning framework that integrates domain knowledge, database schema, chain-of-thought reasoning processes, and metadata relationships to improve SQL generation quality. The proposed framework includes four learning strategies: schema-based learning, Chain-of-Thought (CoT) learning, knowledge-enhanced learning, and key information tokenization. Through fine-tuning, this approach gives the LLM a comprehensive understanding of database structure and metadata, improving its performance on SQL generation within business domains. Through two experimental studies, we have demonstrated the superiority of…
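To make the four strategies more concrete, the following is a minimal sketch of how a single fine-tuning record might combine schema, domain knowledge, chain-of-thought steps, and marker tokens around key information. The abstract does not specify a data format, so everything here is an assumption made for illustration: the `TrainingExample` fields, the marker tokens such as `<schema>` and `<sql>`, the helper names `build_prompt` and `build_target`, and the sample table are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical marker tokens that wrap key information (assumed, not from the paper).
SCHEMA_OPEN, SCHEMA_CLOSE = "<schema>", "</schema>"
KNOWLEDGE_OPEN, KNOWLEDGE_CLOSE = "<knowledge>", "</knowledge>"
COT_OPEN, COT_CLOSE = "<cot>", "</cot>"
SQL_OPEN, SQL_CLOSE = "<sql>", "</sql>"


@dataclass
class TrainingExample:
    """One illustrative fine-tuning record; field names are assumptions."""
    question: str
    schema_ddl: str                                            # schema-based learning
    domain_notes: List[str] = field(default_factory=list)     # knowledge-enhanced learning
    reasoning_steps: List[str] = field(default_factory=list)  # CoT learning
    sql: str = ""


def build_prompt(ex: TrainingExample) -> str:
    """Assemble the model input: schema, optional domain notes, then the question."""
    parts = [f"{SCHEMA_OPEN}\n{ex.schema_ddl.strip()}\n{SCHEMA_CLOSE}"]
    if ex.domain_notes:
        notes = "\n".join(f"- {note}" for note in ex.domain_notes)
        parts.append(f"{KNOWLEDGE_OPEN}\n{notes}\n{KNOWLEDGE_CLOSE}")
    parts.append(f"Question: {ex.question}")
    return "\n\n".join(parts)


def build_target(ex: TrainingExample) -> str:
    """Assemble the training target: numbered reasoning steps, then the SQL."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(ex.reasoning_steps, 1))
    cot = f"{COT_OPEN}\n{steps}\n{COT_CLOSE}\n" if steps else ""
    return f"{cot}{SQL_OPEN}{ex.sql.strip()}{SQL_CLOSE}"


# Made-up sample record for a hypothetical `orders` table.
example = TrainingExample(
    question="What was total revenue per region in 2024?",
    schema_ddl="CREATE TABLE orders (id INT, region TEXT, amount REAL, order_date DATE);",
    domain_notes=["Revenue is the sum of orders.amount."],
    reasoning_steps=["Filter orders to 2024.", "Group by region and sum amount."],
    sql=(
        "SELECT region, SUM(amount) FROM orders "
        "WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01' "
        "GROUP BY region;"
    ),
)

if __name__ == "__main__":
    print(build_prompt(example))
    print(build_target(example))
```

In a sketch like this, the prompt and target strings would be paired as input and label for supervised fine-tuning. Whether marker tokens are added to the tokenizer vocabulary or left as plain text is a separate design choice that the abstract does not settle.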

Taxonomy

Topics: Educational Technology and Assessment · Text and Document Classification Technologies · Topic Modeling