SQL-ASTRA: Alleviating Sparse Feedback in Agentic SQL via Column-Set Matching and Trajectory Aggregation
Long Li, Zhijian Zhou, Jiangxuan Long, Peiyang Liu, Weidi Xu, Zhe Wang, Shirui Pan, Chao Qu

TL;DR
This paper introduces Agentic SQL, a reinforcement learning framework with a two-tiered reward system that addresses credit assignment and reward sparsity in multi-turn Text-to-SQL, yielding a reported 5% gain over binary-reward GRPO on BIRD.
Contribution
The paper proposes a two-tiered reward mechanism for multi-turn Text-to-SQL: Aggregated Trajectory Reward (ATR) for trajectory-level evaluation and Column-Set Matching Reward (CSMR) for dense step-level feedback, with theoretical convergence guarantees.
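To make the idea of a dense, column-level signal concrete, here is a minimal sketch of one plausible reading of "column-set matching". It is an assumption for illustration, not the paper's definition of CSMR: the function name `column_set_match_reward`, the multiset comparison, and the scoring rule (fraction of gold columns recovered) are all hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence

Row = Sequence[Hashable]


def column_multisets(rows: Sequence[Row]) -> list[Counter]:
    """Turn a row-major result table into one value multiset per column."""
    if not rows:
        return []
    return [Counter(col) for col in zip(*rows)]


def column_set_match_reward(pred_rows: Sequence[Row], gold_rows: Sequence[Row]) -> float:
    """Hypothetical partial-credit score in [0, 1].

    Returns the fraction of gold result columns whose value multiset also
    appears as a column of the predicted result. Column order and row order
    are ignored; extra predicted columns are not penalized in this sketch.
    """
    gold_cols = column_multisets(gold_rows)
    if not gold_cols:
        # Empty gold result: full credit only if the prediction is also empty.
        return 1.0 if not pred_rows else 0.0
    remaining = column_multisets(pred_rows)
    matched = 0
    for gold in gold_cols:
        if gold in remaining:
            remaining.remove(gold)  # each predicted column can match only once
            matched += 1
    return matched / len(gold_cols)


gold = [(1, "a"), (2, "b")]
print(column_set_match_reward([(1,), (2,)], gold))  # one of two gold columns recovered
```

Unlike a binary execution-match reward, a score of this kind gives partial credit to a query that retrieves some of the right columns, which is the sense in which the feedback is dense.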
Findings
5% gain over binary-reward GRPO on BIRD
Outperforms SOTA Arctic-Text2SQL-R1-7B on BIRD and Spider 2.0
Guarantees cycle-free policy and monotonic convergence
Abstract
Agentic Reinforcement Learning (RL) shows promise for complex tasks, but Text-to-SQL remains mostly restricted to single-turn paradigms. A primary bottleneck is the credit assignment problem. In traditional paradigms, rewards are determined solely by the final-turn feedback, which ignores the intermediate process and leads to ambiguous credit evaluation. To address this, we propose Agentic SQL, a framework featuring a universal two-tiered reward mechanism designed to provide effective trajectory-level evaluation and dense step-level signals. First, we introduce Aggregated Trajectory Reward (ATR) to resolve multi-turn credit assignment. Using an asymmetric transition matrix, ATR aggregates process-oriented scores to incentivize continuous improvement. Leveraging Lyapunov stability theory, we prove ATR acts as an energy dissipation operator, guaranteeing a cycle-free policy and monotonic…
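The following is a minimal sketch of how an asymmetric transition matrix could aggregate per-turn scores into one trajectory reward. Everything specific here is an assumption made for illustration, since the abstract does not give the details: the three discrete states, the matrix values in `TRANSITION`, and the function name `aggregate_trajectory_reward` are hypothetical. The only property it tries to show is the asymmetry: a regression costs more than the matching improvement earns, so any loop through the states has a negative total.

```python
from typing import Sequence

# Hypothetical per-turn quality states (not taken from the paper):
# 0 = query fails to execute, 1 = executes but wrong result, 2 = correct result.
# TRANSITION[i][j] is the reward for moving from state i to state j.
# Entries are asymmetric: TRANSITION[j][i] is more negative than
# TRANSITION[i][j] is positive, so oscillating between states loses reward.
TRANSITION = [
    [-0.125, 0.5, 1.0],
    [-0.75, -0.125, 0.5],
    [-1.5, -0.75, 0.0],
]


def aggregate_trajectory_reward(states: Sequence[int]) -> float:
    """Sum the transition rewards over consecutive turns of one trajectory."""
    return sum(TRANSITION[a][b] for a, b in zip(states, states[1:]))


direct = aggregate_trajectory_reward([0, 1, 2])          # steady improvement
wobbly = aggregate_trajectory_reward([0, 1, 0, 1, 2])    # improves, regresses, recovers
print(direct, wobbly)
```

In this toy setup both trajectories end in the correct state, but the one that regresses along the way earns less, and a pure cycle such as `[0, 1, 0]` earns a negative total. This sketch scores transitions only; a reward for the final outcome would have to be added separately.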
Taxonomy
Topics: Topic Modeling · Reinforcement Learning in Robotics · Advanced Database Systems and Queries
