SpotIt+: Verification-based Text-to-SQL Evaluation with Database Constraints
Andrew Tremante, Yang He, Rocky Klopfenstein, Yuepeng Wang, Nina Narodytska, Haoze Wu

TL;DR
SpotIt+ is an open-source verification tool that evaluates Text-to-SQL systems by actively searching for database instances on which the generated query and the ground-truth query return different results. Database constraints, mined by rules and validated by an LLM, keep those instances realistic.
Contribution
It introduces a best-effort constraint-mining pipeline that combines rule-based specification mining with LLM-based validation over example databases, so that the differentiating databases used for Text-to-SQL evaluation are realistic.
Findings
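SpotIt+ uncovers numerous discrepancies between generated and gold SQL queries that standard test-based evaluation misses.
Mined constraints lead to more realistic differentiating databases.
On the BIRD dataset, adding the mined constraints preserves SpotIt+'s ability to uncover these discrepancies efficiently.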
Abstract
We present SpotIt+, an open-source tool for evaluating Text-to-SQL systems via bounded equivalence verification. Given a generated SQL query and the ground truth, SpotIt+ actively searches for database instances that differentiate the two queries. To ensure that the generated counterexamples reflect practically relevant discrepancies, we introduce a best-effort constraint-mining pipeline that combines rule-based specification mining with LLM-based validation over example databases. Experimental results on the BIRD dataset show that the mined constraints enable SpotIt+ to generate more realistic differentiating databases, while preserving its ability to efficiently uncover numerous discrepancies between generated and gold SQL queries that are missed by standard test-based evaluation.
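The idea of searching for a differentiating database under constraints can be illustrated with a minimal sketch. It is not SpotIt+'s implementation: it swaps in brute-force enumeration over a tiny value domain, executed with SQLite, in place of the tool's bounded equivalence verification. The table `t(id, age)`, the helper names `rows_of` and `find_counterexample`, the example queries, and the "ages are non-negative" constraint are all hypothetical and chosen only for illustration.

```python
import itertools
import sqlite3


def rows_of(query, rows):
    """Run `query` on a fresh in-memory table t(id, age) holding `rows`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, age INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = sorted(conn.execute(query).fetchall())  # compare as bags of rows
    conn.close()
    return result


def find_counterexample(q1, q2, ids, ages, max_rows, constraint=lambda rows: True):
    """Enumerate every table with at most `max_rows` rows over ids x ages and
    return the first one that satisfies `constraint` and separates q1 from q2.
    Returns None if the queries agree on every admissible table in the bound."""
    candidates = list(itertools.product(ids, ages))
    for n in range(max_rows + 1):
        for rows in itertools.combinations_with_replacement(candidates, n):
            if not constraint(rows):
                continue  # skip databases the (hypothetical) mined constraint rules out
            if rows_of(q1, rows) != rows_of(q2, rows):
                return list(rows)
    return None


# Hypothetical constraint standing in for a mined one: ages are never negative.
def non_negative_ages(rows):
    return all(age >= 0 for _, age in rows)


IDS, AGES = [1], [-5, 18, 30]

# A genuine off-by-one discrepancy: still found when the constraint is enforced.
gold = "SELECT id FROM t WHERE age >= 18"
generated = "SELECT id FROM t WHERE age > 18"
realistic = find_counterexample(gold, generated, IDS, AGES, 2, non_negative_ages)

# Queries that differ only on negative ages: separated without the constraint,
# indistinguishable within the bound once the constraint is enforced.
loose = "SELECT id FROM t WHERE age < 18"
strict = "SELECT id FROM t WHERE age < 18 AND age >= 0"
unconstrained = find_counterexample(loose, strict, IDS, AGES, 2)
constrained = find_counterexample(loose, strict, IDS, AGES, 2, non_negative_ages)
```

In this toy setting the constraint filters out a counterexample that depends on an implausible negative age, while the off-by-one difference between `>=` and `>` is still exposed by a realistic row. Exhaustive enumeration only scales to very small bounds, which is why it serves here purely as an illustration.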
