Text-to-SQL Calibration: No Need to Ask -- Just Rescale Model Probabilities

Ashwin Ramachandran; Sunita Sarawagi

arXiv:2411.16742·cs.DB·November 28, 2024

Open Access

TL;DR

This paper demonstrates that simply rescaling the model's full-sequence probability effectively calibrates confidence in generated SQL queries, outperforming more complex recent methods across multiple benchmarks and models.

Contribution

The study reveals that a basic probability rescaling approach surpasses recent self-checking calibration techniques in Text-to-SQL tasks.

Findings

01

Full-sequence probability rescaling outperforms recent calibration methods.

02

This simple baseline yields better-calibrated confidence than methods that rely on follow-up prompts for self-checking and confidence verbalization.

03

Evaluation across two widely-used Text-to-SQL benchmarks and multiple LLM architectures confirms its effectiveness.
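For readers unfamiliar with how "better calibration" is usually quantified, the sketch below computes expected calibration error (ECE), a common metric that compares average confidence with empirical accuracy inside confidence bins. This summary does not say which metrics the paper reports, so ECE, the function name `expected_calibration_error`, and the choice of 10 equal-width bins are all assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE (an assumed metric, not necessarily the paper's).

    confidences: floats in [0, 1], one per generated SQL query.
    correct:     0/1 flags, 1 if the query was judged correct.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, is_correct in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, is_correct))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(y for _, y in members) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of the data.
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

A lower value means stated confidence tracks actual correctness more closely; a model that reports 0.9 confidence on queries that are right only a quarter of the time would score poorly.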

Abstract

Calibration is crucial as large language models (LLMs) are increasingly deployed to convert natural language queries into SQL for commercial databases. In this work, we investigate calibration techniques for assigning confidence to generated SQL queries. We show that a straightforward baseline -- deriving confidence from the model's full-sequence probability -- outperforms recent methods that rely on follow-up prompts for self-checking and confidence verbalization. Our comprehensive evaluation, conducted across two widely-used Text-to-SQL benchmarks and multiple LLM architectures, provides valuable insights into the effectiveness of various calibration strategies.
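The baseline described above can be sketched in a few lines: sum the per-token log-probabilities of the generated SQL to get the full-sequence log-probability, then rescale that score into a confidence using a small held-out set of queries labelled correct or incorrect. The abstract does not specify the rescaling function, so the example assumes Platt scaling (a logistic fit); the helper names, learning rate, step count, and the toy numbers are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sequence_log_prob(token_log_probs):
    """Log of the full-sequence probability: the sum of token log-probs."""
    return sum(token_log_probs)


def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit confidence = sigmoid(a * score + b) by gradient descent on log loss.

    Platt scaling is an assumed choice of rescaler, used here for illustration.
    """
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(a * s + b)
            grad_a += (p - y) * s
            grad_b += p - y
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def calibrated_confidence(token_log_probs, a, b):
    """Rescaled confidence for one generated SQL query."""
    return sigmoid(a * sequence_log_prob(token_log_probs) + b)


# Toy held-out set (made-up values): sequence log-probs and correctness flags.
held_out_scores = [-0.1, -0.3, -0.2, -0.4, -2.0, -2.5, -3.0, -4.0]
held_out_labels = [1, 1, 1, 0, 1, 0, 0, 0]
a, b = fit_platt(held_out_scores, held_out_labels)

likely_correct = calibrated_confidence([-0.05, -0.05, -0.1], a, b)
likely_wrong = calibrated_confidence([-1.0, -1.5, -1.0], a, b)
```

Nothing here requires a follow-up prompt to the model: the only inputs are the token log-probabilities already produced during generation plus a handful of labelled examples for fitting the two parameters.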

Taxonomy

Topics: Scientific Computing and Data Management