Towards Optimizing SQL Generation via LLM Routing
Mohammadhossein Malekpour, Nour Shaheen, Foutse Khomh, Amine Mhedhbi

TL;DR
This paper proposes an LLM routing method for Text-to-SQL that dynamically selects the most cost-effective model for each query, maintaining accuracy while reducing latency and costs.
Contribution
It introduces the first LLM routing approach for Text-to-SQL, with two routing strategies (score-based and classification-based) that trade off accuracy against cost.
Findings
Routing strategies achieve accuracy comparable to the most capable LLM
Reduced latency and cost on simpler queries
Practical, explainable accuracy-cost trade-off demonstrated on the BIRD dataset
Abstract
Text-to-SQL enables users to interact with databases through natural language, simplifying access to structured data. Although highly capable large language models (LLMs) achieve strong accuracy for complex queries, they incur unnecessary latency and dollar cost for simpler ones. In this paper, we introduce the first LLM routing approach for Text-to-SQL, which dynamically selects the most cost-effective LLM capable of generating accurate SQL for each query. We present two routing strategies (score- and classification-based) that achieve accuracy comparable to the most capable LLM while reducing costs. We design the routers for ease of training and efficient inference. In our experiments, we highlight a practical and explainable accuracy-cost trade-off on the BIRD dataset.
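The abstract does not spell out how the routers work, so the following is only a minimal sketch of what a score-based router could look like, under stated assumptions: each candidate model has a known relative cost, and some scoring function estimates the chance that a model produces correct SQL for a question. The names `Candidate`, `route_by_score`, `toy_score`, the model labels, the costs, and the threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str              # hypothetical model label
    cost_per_query: float  # hypothetical relative cost, not a real price


def route_by_score(
    question: str,
    candidates: Sequence[Candidate],
    score_fn: Callable[[str, Candidate], float],
    threshold: float = 0.5,
) -> Candidate:
    """Return the cheapest candidate whose predicted success score meets
    the threshold; fall back to the most expensive candidate otherwise."""
    ordered = sorted(candidates, key=lambda c: c.cost_per_query)
    for candidate in ordered:
        if score_fn(question, candidate) >= threshold:
            return candidate
    return ordered[-1]


# Assumed capability values, used only by the toy scorer below.
CAPABILITY = {"small": 0.7, "medium": 0.85, "large": 1.0}


def toy_score(question: str, candidate: Candidate) -> float:
    """Stand-in scorer: treats longer questions as harder. A real router
    would use a trained model here instead of this heuristic."""
    difficulty = min(len(question.split()) / 30.0, 1.0)
    return max(0.0, CAPABILITY[candidate.name] - 0.5 * difficulty)


MODELS = [
    Candidate("large", 10.0),
    Candidate("small", 1.0),
    Candidate("medium", 3.0),
]

if __name__ == "__main__":
    for q in ["How many users are there?", " ".join(["word"] * 15)]:
        print(route_by_score(q, MODELS, toy_score).name)
```

Raising `threshold` sends more questions to the larger models (higher cost, presumably higher accuracy), while lowering it keeps more traffic on the cheaper ones, which is one way to expose an accuracy-cost trade-off as a single knob.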
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Database Systems and Queries · Service-Oriented Architecture and Web Services
