ReViSQL: Achieving Human-Level Text-to-SQL
Yuxuan Zhu, Tengjun Jin, Yoojin Choi, Daniel Kang

TL;DR
ReViSQL is a streamlined framework that achieves human-level accuracy in Text-to-SQL translation by improving data quality and employing reinforcement learning, without requiring complex architectures.
Contribution
The paper introduces ReViSQL, a simple yet effective approach that surpasses previous models in Text-to-SQL accuracy through data verification and inference-time scaling.
Findings
ReViSQL-235B-A22B achieves 93.2% accuracy on BIRD Mini-Dev.
Improving data quality boosts accuracy by 8.2-13.9%.
ReViSQL-30B-A3B matches SOTA at lower cost.
Abstract
Translating natural language to SQL (Text-to-SQL) is a critical challenge in both database research and data analytics applications. Recent efforts have focused on enhancing SQL reasoning by developing large language models and AI agents that decompose Text-to-SQL tasks into manually designed, step-by-step pipelines. However, despite these extensive architectural engineering efforts, a significant gap remains: even state-of-the-art (SOTA) AI agents have not yet achieved human-level accuracy on the BIRD benchmark. In this paper, we show that closing this gap does not require further architectural complexity, but rather clean training data to improve the SQL reasoning of the underlying models. We introduce ReViSQL, a streamlined framework that achieves human-level accuracy on BIRD for the first time. Instead of complex AI agents, ReViSQL leverages reinforcement learning with verifiable…
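To make the accuracy figures above concrete, here is a minimal sketch of an execution-based correctness check, a common way to score Text-to-SQL output: a predicted query counts as correct when it returns the same rows as the reference query. This is an illustration only. The helper names (`execution_match`, `accuracy`), the toy `students` table, and the multiset comparison are assumptions, not the paper's code, and the benchmark's official evaluator may compare results differently.

```python
import sqlite3
from collections import Counter


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same multiset of rows.

    Hypothetical helper: row order is ignored, duplicates are counted.
    """
    try:
        pred_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # a predicted query that fails to run counts as wrong
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(pred_rows) == Counter(gold_rows)


def accuracy(conn, pairs):
    """Fraction of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return hits / len(pairs)


# Toy database, invented for this example.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, gpa REAL);
    INSERT INTO students VALUES (1, 'Ana', 3.9), (2, 'Ben', 3.1), (3, 'Cy', 3.6);
    """
)

pairs = [
    # Different SQL text, same result rows: counted as correct.
    ("SELECT name FROM students WHERE gpa > 3.5",
     "SELECT name FROM students WHERE gpa >= 3.6"),
    ("SELECT COUNT(*) FROM students",
     "SELECT COUNT(id) FROM students"),
    # Misspelled column: the query fails to execute, counted as wrong.
    ("SELECT nme FROM students",
     "SELECT name FROM students"),
]

score = accuracy(conn, pairs)
print(f"execution accuracy: {score:.3f}")
```

Because the check depends only on query results, it can be verified automatically, which is why checks of this kind are often used both for evaluation and as a training signal. It also shows why label quality matters: an incorrect reference query would penalize a correct prediction.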
