APEX-SQL: Talking to the data via Agentic Exploration for Text-to-SQL
Bowen Cao, Weibin Liao, Yushi Sun, Dong Fang, Haitao Li, Wai Lam

TL;DR
APEX-SQL is an agentic exploration framework for Text-to-SQL. It helps large language models handle complex, real-world databases by verifying hypotheses against the actual data, which improves both accuracy and efficiency.
Contribution
The paper presents an agentic exploration approach that strengthens Text-to-SQL systems by combining logical planning, data validation, and exploration directives, addressing the limitations of relying on a static schema.
Findings
Achieves 70.65% execution accuracy on the BIRD benchmark.
Outperforms baselines while consuming fewer tokens.
Agentic exploration significantly boosts model reasoning in enterprise data environments.
Abstract
Text-to-SQL systems powered by Large Language Models have excelled on academic benchmarks but struggle in complex enterprise environments. The primary limitation lies in their reliance on static schema representations, which fails to resolve semantic ambiguity and scale effectively to large, complex databases. To address this, we propose APEX-SQL, an Agentic Text-to-SQL Framework that shifts the paradigm from passive translation to agentic exploration. Our framework employs a hypothesis-verification loop to ground model reasoning in real data. In the schema linking phase, we use logical planning to verbalize hypotheses, dual-pathway pruning to reduce the search space, and parallel data profiling to validate column roles against real data, followed by global synthesis to ensure topological connectivity. For SQL generation, we introduce a deterministic mechanism to retrieve exploration…
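The hypothesis-verification idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the helper names (`profile_column`, `verify_value_hypothesis`, `build_demo_db`), the `orders` table, and its rows are all hypothetical, and SQLite stands in for whatever database the framework targets. The sketch only shows how profiling a column and probing it for a literal from the question could confirm or reject a guessed column role.

```python
import sqlite3


def _quote(identifier):
    # Quote an SQL identifier; names are assumed to come from the schema itself.
    return '"' + identifier.replace('"', '""') + '"'


def profile_column(conn, table, column, sample_size=5):
    """Hypothetical helper: collect basic statistics and sample values for a column."""
    t, c = _quote(table), _quote(column)
    total, non_null, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT({c}), COUNT(DISTINCT {c}) FROM {t}"
    ).fetchone()
    samples = [
        row[0]
        for row in conn.execute(
            f"SELECT DISTINCT {c} FROM {t} WHERE {c} IS NOT NULL "
            f"ORDER BY {c} LIMIT ?",
            (sample_size,),
        )
    ]
    return {"rows": total, "non_null": non_null, "distinct": distinct, "samples": samples}


def verify_value_hypothesis(conn, table, column, value):
    """Hypothetical helper: check whether `value` actually occurs in the column."""
    t, c = _quote(table), _quote(column)
    row = conn.execute(f"SELECT 1 FROM {t} WHERE {c} = ? LIMIT 1", (value,)).fetchone()
    return row is not None


def build_demo_db():
    # Toy data, invented for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "shipped", "CA"), (2, "pending", "NY"), (3, "shipped", None)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    # Question: "How many orders were shipped?" Both `status` and `state` are
    # plausible from the names alone; the data decides between them.
    for candidate in ("status", "state"):
        print(candidate, profile_column(conn, "orders", candidate))
        print(candidate, verify_value_hypothesis(conn, "orders", candidate, "shipped"))
```

In this toy case the probe finds `shipped` in `status` but not in `state`, so the ambiguity that the column names leave open is resolved by the data rather than by the static schema.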
Taxonomy
Topics: Semantic Web and Ontologies · Scientific Computing and Data Management · Advanced Database Systems and Queries
