TL;DR
EGRefine is a framework that improves Text-to-SQL accuracy by iteratively renaming ambiguous schema columns, verifying candidate names against execution feedback, and exposing the refined schema as non-destructive SQL views.
Contribution
The paper frames schema refinement for Text-to-SQL as a constrained optimization problem and introduces EGRefine, a four-phase pipeline that seeks to maximize execution accuracy while preserving query equivalence through database views.
Findings
EGRefine recovers accuracy lost due to schema noise.
The approach ensures query equivalence through view-based materialization.
Refined schemas transfer across different Text-to-SQL models.
Abstract
Text-to-SQL enables non-expert users to query databases in natural language, yet real-world schemas often suffer from ambiguous, abbreviated, or inconsistent naming conventions that degrade model accuracy. Existing approaches treat schemas as fixed and address errors downstream. In this paper, we frame schema refinement as a constrained optimization problem: find a renaming function that maximizes downstream Text-to-SQL execution accuracy while preserving query equivalence through database views. We analyze the computational hardness of this problem, which motivates a column-wise greedy decomposition, and instantiate it as EGRefine: a four-phase pipeline that screens ambiguous columns, generates context-aware candidate names, verifies them through execution-grounded feedback, and materializes the result as non-destructive SQL views. The pipeline carries two structural properties:…
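The view-based materialization step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the helper `materialize_view`, the table `emp`, the view `employee`, and the rename map are all hypothetical, and SQLite is used only because it ships with Python. The sketch shows how a view can expose clearer column names without altering the base table, so a query written against the refined names returns the same rows as its counterpart on the original schema.

```python
import sqlite3


def materialize_view(conn, table, view, renames):
    """Create a view over `table` that exposes renamed columns.

    Hypothetical helper for illustration; `renames` maps original
    column names to refined names. Unmapped columns keep their names.
    """
    # Read the column names from an empty result set's cursor metadata.
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT 0')
    columns = [desc[0] for desc in cursor.description]
    select_list = ", ".join(
        f'"{col}" AS "{renames.get(col, col)}"' for col in columns
    )
    conn.execute(f'CREATE VIEW "{view}" AS SELECT {select_list} FROM "{table}"')


conn = sqlite3.connect(":memory:")
# A toy table with abbreviated column names (assumed example data).
conn.execute("CREATE TABLE emp (eid INTEGER, nm TEXT, sal REAL)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [(1, "Ada", 120.0), (2, "Bob", 95.0)],
)

materialize_view(
    conn,
    table="emp",
    view="employee",
    renames={"eid": "employee_id", "nm": "name", "sal": "salary"},
)

# The same question asked against the original and the refined schema.
original = conn.execute("SELECT nm FROM emp WHERE sal > 100").fetchall()
refined = conn.execute("SELECT name FROM employee WHERE salary > 100").fetchall()

# The base table is untouched: its columns keep their original names.
base_columns = [d[0] for d in conn.execute("SELECT * FROM emp LIMIT 0").description]
```

Because the view is only a named query over `emp`, dropping it restores the original schema exactly, which is the sense in which the renaming is non-destructive.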
