An Agentic System for Schema Aware NL2SQL Generation
David Onyango, Naseef Mansoor

TL;DR
This paper introduces a schema-aware agentic NL2SQL system that uses small language models with a fallback to large models, significantly reducing costs while maintaining high accuracy in translating natural language to SQL.
Contribution
The proposed system strategically combines small and large language models to improve efficiency and reduce costs in NL2SQL tasks, addressing deployability concerns.
Findings
Achieves 47.78% execution accuracy on the BIRD benchmark.
Reduces operational costs by over 90% compared to LLM-only systems.
Resolves approximately 67% of queries using local SLMs.
Abstract
The natural language to SQL (NL2SQL) task plays a pivotal role in democratizing data access by enabling non-expert users to interact with relational databases through intuitive language. While recent frameworks have enhanced translation accuracy via task specialization, their reliance on Large Language Models (LLMs) raises significant concerns regarding computational overhead, data privacy, and real-world deployability in resource-constrained environments. To address these challenges, we propose a schema-based agentic system that strategically employs Small Language Models (SLMs) as primary agents, complemented by a selective LLM fallback mechanism. Because the LLM is invoked only upon detection of errors in SLM-generated output, the proposed system significantly reduces computational expenditure. Experimental results on the BIRD benchmark demonstrate that our system achieves an execution…
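The SLM-first, LLM-fallback control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`generate_sql`, `try_execute`, `schema_text`), the toy model stubs, and the choice of SQLite are all assumptions made for illustration, and "error detection" is reduced here to a failed execution, which may be narrower than what the paper does.

```python
import sqlite3
from typing import Callable, Optional, Tuple

# Hypothetical model interfaces (assumptions, not from the paper):
# the SLM sees the question and schema; the LLM also sees the error.
SlmFn = Callable[[str, str], str]
LlmFn = Callable[[str, str, str], str]


def schema_text(conn: sqlite3.Connection) -> str:
    """Collect CREATE TABLE statements so prompts can be schema-aware."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def try_execute(conn: sqlite3.Connection, sql: str) -> Optional[str]:
    """Return None if the query runs, otherwise the database error message."""
    try:
        conn.execute(sql).fetchall()
        return None
    except sqlite3.Error as exc:
        return str(exc)


def generate_sql(
    question: str, conn: sqlite3.Connection, slm: SlmFn, llm: LlmFn
) -> Tuple[str, str]:
    """Try the small model first; call the large model only on a detected error."""
    schema = schema_text(conn)
    candidate = slm(question, schema)
    error = try_execute(conn, candidate)
    if error is None:
        return candidate, "slm"
    # Fallback path: pass the error along so the larger model can repair it.
    return llm(question, schema, error), "llm"


# Toy stand-ins for real models, used only to exercise both paths.
def toy_slm(question: str, schema: str) -> str:
    if "count" in question.lower():
        return "SELECT COUNT(*) FROM users"
    return "SELECT username FROM users"  # deliberately wrong column name


def toy_llm(question: str, schema: str, error: str) -> str:
    return "SELECT name FROM users"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

for q in ("Count the users", "List all user names"):
    sql, source = generate_sql(q, conn, toy_slm, toy_llm)
    print(f"{q!r} -> {sql!r} (answered by {source})")
```

In this sketch the expensive model is only reached when the cheaper candidate fails to execute, which is the property that would drive the cost savings the paper reports. A query that executes but returns the wrong rows would not be caught by this simplified check.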
Taxonomy
Topics: Cloud Computing and Resource Management · Natural Language Processing Techniques · Big Data and Digital Economy
