Confidence Estimation for Text-to-SQL in Large Language Models

Sepideh Entezari Maleki; Mohammadreza Pourreza; Davood Rafiei

arXiv:2508.14056 · cs.CL · August 21, 2025

TL;DR

This paper investigates confidence estimation methods for text-to-SQL tasks using large language models, comparing black-box and white-box strategies, and demonstrating the benefits of execution grounding and syntax-aware approaches.

Contribution

The paper introduces and evaluates black-box and white-box confidence estimation techniques for LLM-generated SQL queries, highlighting the effectiveness of consistency-based and SQL-syntax-aware methods on cross-domain benchmarks.

Findings

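01 Among black-box methods, consistency-based approaches perform best.

02 In white-box settings, SQL-syntax-aware approaches are advantageous for interpreting LLM logits.

03 Execution-based grounding of queries provides a supplementary signal that improves both black-box and white-box estimation.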

Abstract

Confidence estimation for text-to-SQL aims to assess the reliability of model-generated SQL queries without having access to gold answers. We study this problem in the context of large language models (LLMs), where access to model weights and gradients is often constrained. We explore both black-box and white-box confidence estimation strategies, evaluating their effectiveness on cross-domain text-to-SQL benchmarks. Our evaluation highlights the superior performance of consistency-based methods among black-box models and the advantage of SQL-syntax-aware approaches for interpreting LLM logits in white-box settings. Furthermore, we show that execution-based grounding of queries provides a valuable supplementary signal, improving the effectiveness of both approaches.
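To make the idea of consistency-based estimation with execution grounding concrete, here is a minimal sketch, not the authors' implementation. It assumes several candidate SQL queries have already been sampled from an LLM for the same question, executes each one against a SQLite database, and takes as confidence the fraction of samples whose execution result matches the most common result. The helper names `execute` and `consistency_confidence`, the `singer` table, and the candidate queries are all hypothetical and chosen only for illustration.

```python
import sqlite3
from collections import Counter


def execute(conn, sql):
    """Run a query; return an order-insensitive, hashable result, or None on error."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None  # queries that fail to execute cast no vote
    return tuple(sorted(rows, key=repr))


def consistency_confidence(conn, candidates):
    """Return (chosen_sql, confidence) based on agreement among execution results."""
    results = [execute(conn, sql) for sql in candidates]
    votes = Counter(r for r in results if r is not None)
    if not votes:
        return None, 0.0
    top_result, count = votes.most_common(1)[0]
    # Pick the first sampled query that produced the majority result.
    chosen = candidates[results.index(top_result)]
    # Failed queries stay in the denominator, so they lower the confidence.
    return chosen, count / len(candidates)


# Toy database (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
INSERT INTO singer VALUES (1, 'Ana', 30), (2, 'Bo', 41), (3, 'Cy', 25);
""")

# Stand-ins for queries sampled from an LLM for one question.
candidates = [
    "SELECT name FROM singer WHERE age > 28",
    "SELECT name FROM singer WHERE age >= 29",  # different text, same result
    "SELECT name FROM singer WHERE age > 40",   # different result
    "SELECT nme FROM singer WHERE age > 28",    # invalid column, fails to run
]
chosen, confidence = consistency_confidence(conn, candidates)
print(chosen, confidence)
```

Comparing execution results rather than query strings lets syntactically different but equivalent queries count as agreeing, which is one way execution can ground a consistency score. Whether the paper aggregates votes in exactly this way is not stated in the abstract.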

Taxonomy

Topics: Scientific Computing and Data Management · Data Quality and Management · Advanced Database Systems and Queries