Confidence Estimation for Error Detection in Text-to-SQL Systems

Oleg Somov; Elena Tutubalina

arXiv:2501.09527 · cs.LG · April 23, 2025

TL;DR

This paper explores confidence estimation and selective classification techniques to improve error detection and calibration in Text-to-SQL systems, enhancing their robustness and interpretability.

Contribution

It introduces entropy-based selective classifiers and calibration methods to improve error detection and confidence alignment in Text-to-SQL models, with empirical evaluation across different architectures.

Findings

01. Encoder-decoder T5 is better calibrated than GPT-4 and Llama 3.

02. Selective classifiers effectively detect errors in irrelevant questions.

03. Calibration techniques improve the alignment between model confidence and accuracy.
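How well confidence and accuracy align is commonly summarised by the expected calibration error (ECE). The sketch below is a generic illustration, not the paper's own evaluation code. The function name, the equal-width binning, and the choice of 10 bins are all assumptions made for this example.

```python
def expected_calibration_error(confidences, is_correct, n_bins=10):
    """ECE with equal-width confidence bins (illustrative, hypothetical helper).

    confidences: floats in [0, 1], one per prediction.
    is_correct:  booleans, True when the predicted query was right.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, is_correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

A lower value means that stated confidence tracks observed accuracy more closely. Comparing the value before and after a calibration step is one way to check whether that step helped.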

Abstract

Text-to-SQL enables users to interact with databases through natural language, simplifying the retrieval and synthesis of information. Despite the success of large language models (LLMs) in converting natural language questions into SQL queries, their broader adoption is limited by two main challenges: achieving robust generalization across diverse queries and ensuring interpretative confidence in their predictions. To tackle these issues, our research investigates the integration of selective classifiers into Text-to-SQL systems. We analyse the trade-off between coverage and risk using entropy-based confidence estimation with selective classifiers and assess its impact on the overall performance of Text-to-SQL models. Additionally, we explore the models' initial calibration and improve it with calibration techniques for better alignment between model confidence and accuracy. Our…
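The coverage and risk trade-off mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the per-token output distributions of a generated query are available, and it takes the maximum token entropy as the sequence-level uncertainty. Both the aggregation rule and all function names are assumptions made for this example and may differ from what the authors use.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sequence_uncertainty(step_distributions):
    """Uncertainty of a generated query: the maximum per-token entropy.

    Taking the maximum is an assumption for this sketch; a mean or another
    aggregate would fit the same interface.
    """
    return max(token_entropy(dist) for dist in step_distributions)


def coverage_and_risk(uncertainties, is_correct, threshold):
    """Accept predictions whose uncertainty is at or below the threshold.

    Returns (coverage, risk): the fraction of predictions accepted, and
    the error rate among the accepted ones (0.0 when nothing is accepted).
    """
    accepted = [ok for u, ok in zip(uncertainties, is_correct) if u <= threshold]
    coverage = len(accepted) / len(uncertainties)
    if not accepted:
        return coverage, 0.0
    risk = sum(1 for ok in accepted if not ok) / len(accepted)
    return coverage, risk
```

Sweeping the threshold from low to high traces out a curve: a strict threshold answers fewer questions but with fewer errors, while a loose one answers everything at the model's raw error rate.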

Code & Models

Repositories

runnerup96/error-detection-in-text2sql (official)

Videos

Confidence Estimation for Error Detection in Text-to-SQL Systems · underline

Taxonomy

Topics: Service-Oriented Architecture and Web Services

Methods: Attention Is All You Need · Discriminative Fine-Tuning · Cosine Annealing · Adam · Dropout · SentencePiece · Softmax · Byte Pair Encoding · Linear Layer